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Abstract 


Horton,  Kirk  Gerritt.  Fault  Detection  and  Model  Identification  in  Linear  Dynamical 
Systems.  (Under  the  direction  of  Dr.  Stephen  La  Vern  Campbell.) 

Linear  dynamical  systems,  Ex'  +  Fx  =  f{t),  in  which  E  is  singular,  are  useful  in  a 
wide  variety  of  applications.  Because  of  this  wide  spread  applicability,  much  research 
has  been  done  recently  to  develop  theory  for  the  design  of  linear  dynamical  systems. 
A  key  aspect  of  system  design  is  fault  detection  and  isolation  (FDI).  One  avenue  of 
FDI  is  via  the  multi-model  approach,  in  which  the  parameters  of  the  nominal,  unfailed 
model  of  the  system  are  known,  as  well  as  the  parameters  of  one  or  more  fault  models. 
The  design  goal  is  to  obtain  an  indicator  for  when  a  fault  has  occurred,  and,  when 
more  than  one  type  is  possible,  which  type  of  fault  it  is.  A  choice  that  must  be  made 
in  the  system  design  is  how  to  model  noise.  One  way  is  as  a  bounded  energy  signal. 
This  approach  places  very  few  restrictions  on  the  types  of  noisy  systems  which  can 
be  addressed,  requiring  no  complex  modeling  requirement. 

This  thesis  applies  the  multi-model  approach  to  FDI  in  linear  dynamical  systems, 
modeling  noise  as  bounded  energy  signals.  A  complete  algorithm  is  developed,  re¬ 
quiring  very  little  on-line  computation,  with  which  nearly  perfect  fault  detection  and 
isolation  over  a  finite  horizon  is  attained.  The  algorithm  applies  techniques  to  convert 
complex  system  relationships  into  necessary  and  sufficient  conditions  for  the  solutions 
to  optimal  control  problems.  The  first  such  problem  provides  the  fault  indicator  via 


the  minimum  energy  detection  signal,  while  the  second  problem  provides  for  fault 
isolation  via  the  separating  hyperplane.  The  algorithm  is  implemented  and  tested 
on  a  suite  of  examples  in  commercial  optimization  software.  The  algorithm  is  shown 
to  have  promise  in  nonlinear  problems,  time  varying  problems,  and  certain  types  of 
linear  problems  for  which  existing  theory  is  not  suitable. 
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Chapter  1 


Introduction  and  Review  of  Prior 
Research 


Models  of  dynamical  systems  that  consist  of  a  set  of  linear  differential  and  algebraic 
equations  (DAEs) 

Ez'  +  Fz  =  f{t)  (1.1) 

in  which  the  (square)  matrix  E  is  singular,  are  called  linear  descriptor  systems.  Many 
systems  throughout  a  wide  variety  of  applications  are  most  easily  described  as  linear 
descriptor  systems.  Variational  problems  subject  to  constraints,  such  as  the  equations 
of  motion  for  a  robotic  arm,  can  often  be  written  as  descriptor  systems.  Network 
modeling  problems,  as  in  electrical  circuit  design,  are  another  example.  The  list 
continues  with  model  reduction  problems,  singular  perturbations,  and  discretizations 
of  partial  differential  equations,  just  to  name  a  few.  (See  [5,  8]  for  an  in-depth 
description  of  applications  and  examples.)  Because  of  this  wide  spread  applicability, 
much  research  has  been  done  recently  involving  linear  DAEs. 

A  key  aspect  of  system  design  in  linear  DAE  modeled  systems  is  fault  detection 
and  isolation  (EDI).  One  avenue  of  EDI  is  via  the  multi-model  approach,  in  which 
the  parameters  of  the  nominal,  unfailed  model  of  the  system  are  known,  as  well  as 
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the  parameters  of  one  or  more  fault  models.  The  design  goal  is  to  obtain  an  indicator 
that  tells  the  operator  when  a  fault  has  occurred,  and,  when  more  than  one  type  is 
possible,  which  type  of  fault  it  is. 

Another  aspect  of  system  design  is  the  modeling  of  noise.  One  way  to  model 
noise  is  as  a  bounded  energy  signal.  This  approach  places  very  few  restrictions  on  the 
types  of  noisy  systems  which  can  be  addressed.  It  also  presents  no  complex  modeling 
requirement,  a  very  useful  computational  tool  of  which  we  can  take  full  advantage. 

In  this  thesis  we  apply  the  multi-model  approach  to  FBI  in  linear  descriptor 
systems,  modeling  noise  as  bounded  energy  signals.  The  combination  appears  to 
be  under-explored,  in  that  very  little  research  seems  to  exist  that  uses  both  the 
multi-model  approach  and  bounded  energy  noise.  We  develop  a  complete  algorithm, 
requiring  very  little  on-line  computation  by  an  operator,  with  which  nearly  perfect 
fault  detection  and  isolation  over  a  finite  horizon  is  attained. 


1.1  Linear  Descriptor  Systems 

We  begin  with  a  short  introduction  to  descriptor  systems,  the  basic  theory  and  several 
numerical  methods  used  to  obtain  solutions. 

1.1.1  Basic  Theory 

As  described  above,  DAEs  occur  in  many  applications.  Models  that  consist  of  a  set 
of  ordinary  differential  equations  (ODEs)  often  are  first  written  as  DAEs.  A  DAE 
is  manipulated  through  differentiation  and  substitution  to  convert  it  to  ODE  form. 
Consider 


y 


ax  +  by 
cx  +  d 


(1.2a) 

(1.2b) 
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in  which  a,  b,  c,  and  d  are  scalar  constants.  Equation  (1.2)  consists  of  a  differential 
equation  (1.2a)  and  an  algebraic  constraint  (1.2b).  The  Jacobian  of  (1.2)  with  respect 
to  x',  y'  is 

1  0 
0  0 

which  is  singular.  By  differentiating  (1.2b)  we  obtain  the  full  system 


x'  =  ax  +  by 

(1.3a) 

y  =  cx  +  d 

{1,3b) 

y'  =  ex' . 

(1.3c) 

We  can  substitute  (1.3b)  into  (1.3a)  to  obtain  the  ODE 

x'  =  {a  +  bc)x  +  bd 

(1.4a) 

y'  =  ex' 

(1.4b) 

the  solution  of  which  is  easily  obtained. 

Frequently,  reasons  exist  for  not  attempting  to  manipulate  a  system  like  (1.2)  into 
explicit  (ODE)  form.  First,  physical  problems  initially  modeled  as  DAEs  contain 
relationships  between  variables  of  interest.  Changing  to  an  explicit  model  may  result 
in  less  meaningful  variables,  as  well  as  a  loss  of  the  importance  of  the  relationships 
between  those  variables.  In  addition,  sparsity  is  usually  lost.  Numerical  niethods 
that  rely  on  the  sparsity  of  a  DAE  may  not  be  suitable  for  solving  the  ODE  that 
is  obtained  by  differentiation  and  substitution.  Finally,  it  may  not  be  easy  or  even 
possible  to  convert  a  complex  system  into  ODE  form.  When  it  is  possible,  it  might  be 
easier  to  solve  the  DAE  directly  than  to  do  the  mathematical  manipulation  necessary 
to  convert  it  to  an  ODE.  See  [5]  for  a  more  detailed  explanation  as  to  why  the  DAE 
form  of  a  model  may  be  preferred  over  the  ODE  form. 
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It  is  due  to  these  reasons,  among  others,  that  the  base  of  research  in  DAEs  has 
continuously  grown  over  the  last  several  years.  At  the  heart  of  the  theory  are  two  key 
concepts,  solvability  and  the  uniform  differentiation  index  [5]. 

Definition  1.1.  The  system  (1-1),  where  E  and  F  are  m  x  m  matrices,  is 
solvable  on  an  interval  if  for  every  m-times  differentiable  f(t),  there  is  at  least  one 
continuously  differentiable  solution  to  (1.1).  In  addition,  solutions  are  defined  on  the 
entire  interval  and  are  uniquely  determined  by  their  value  at  any  t  in  the  interval. 

We  will  return  to  the  necessary  and  sufficient  conditions  for  solvability  of  certain 
types  of  DAEs  a  bit  later. 

Definition  1.2.  The  minimum  number  of  times  that  all  or  part  of  (1.1)  must 
be  differentiated  with  respect  to  t  in  order  to  determine  z'  as  a  continuous  function  of 
z,t  is  the  index,  u,  of  DAE  (1.1). 

Example  (1.2)  is  an  index  one  DAE.  Numerical  methods  are  well  developed  for 
index  one  DAEs.  Higher  index  problems  are  notoriously  more  difficult  to  solve  via 
numerical  methods.  Fortunately,  all  of  the  DAEs  we  will  deal  with  in  this  thesis  are 
of  index  one. 

Of  the  several  special  structural  forms  for  DAEs  found  in  the  literature,  two  will 
be  of  interest  in  this  research: 

•  Linear  Time  Invariant  DAE 

Ez'  +  Fz  =  fit)  (1.5) 

•  Linear  Time  Varying  DAE 

Eit)z'  +  Fit)z  =  fit)  (1.6) 

We  mention  three  other  types  for  completeness: 

•  Linear  in  the  derivative,  nonlinear  DAE 


E{z)2'  +F{z)z  =  f{t) 


(1.7) 
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•  Semi-Explicit  (nonlinear)  DAE 

z'  =  f[z{t),u{t),t] 

{1.8a) 

o 

II 

(1.8b) 

•  Fully  Implicit  (nonlinear)  DAE 

II 

o 

(1.9) 

The  extension  of  our  algorithm  to  these  problems  will  be  left  to  future  research. 

The  theory  for  (1.5)  and  (1.6)  is  fairly  well  understood.  For  (1.5),  solvability  is 
expressed  in  terms  of  a  matrix  pencil  For  square  matrices  E  and  F,  and  complex 
parameter  X,  XE  +  F  is  called  a  matrix  pencil.  If  its  determinant  is  not  identically 
zero  as  a  function  of  A,  then  the  pencil  XE  +  F  is  said  to  be  regular.  Equation  (1.5) 
is  solvable  if  and  only  if  XE  +  F  is  a  regular  pencil  [5].  If  (1.5)  is  solvable  we  can  let 
2;  =  Qw  and  premultiply  by  P  so  that  (1.5)  becomes 

PEQw' +  PFQw  =  Pf{t)  =  g{t)  (1.10) 


where  P,  Q  are  nonsingular  matrices  such  that 


PEQ  = 


I  0 
0  N 


,  PFQ  = 


C  0 
0  I 


N  is  a  nilpotent  matrix  the  index  of  which  is  the  same  as  the  uniform  differentiation 
index  of  DAE  (1.5).  The  system  is  then  decoupled,  and  can  be  written  as 


w[  +  Cwi=gi{t)  (l.lla) 

Nw2  +  W2  =  g2{t).  (1.11b) 


Equation  (l.lla)  is  an  ODE  for  which  a  solution  exi.sts  for  any  initial  value  of  ruj  and 
any  continuous  forcing  function  gi{t).  The  unique  solution  to  (1.11b)  is 

i/-i 

W2  =  {ND  +  I)-^g2{t)  =  ^(-l)W'5?(t) 

2=0 
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where  u  is  the  index,  or  degree  of  nilpotency  of  N,  and  D  is  the  differentiation 
operator.  Note  that  the  initial  values  of  W2  are  completely  determined. 

In  the  linear  time-varying  case,  (1.6),  a  similar  result  holds.  While  the  nature  of 
the  matrix  pencil  XE{t)  +  F{t)  is  no  longer  an  indicator  of  solvability,  the  form  of 
(1.11)  is  still  important  in  linear  time- varying  DAEs. 

Definition  1.3.  The  system  (1.6)  is  in  standard  canonical  form  if  it  is  in  the 
form 


[0  N{t) 

where  N  is  strictly  lower  ( or  upper)  triangular. 

If  E{t),  F{t)  are  real  analytic,  then  (1.6)  is  solvable  if  and  only  if,  after  linear 
time- varying  coordinate  changes,  it  can  be  written  as  (1.12).  The  problem  exists  in 
the  difficulty  of  finding  those  coordinate  changes  that  allow  us  to  write  the  DAE  in 
standard  canonical  form. 

For  the  other  three  cases  mentioned  above,  (1.7)-(1.9),  the  theory  is  much  newer 
and  also  much  less  understood.  For  the  purpose  of  this  thesis,  it  is  sufficient  to  note 
that  the  concepts  presented  above  serve  as  a  basis  for  the  development  of  the  theory 
for  these  cases.  It  should  be  noted  that  while  this  newer  theory  is  beyond  the  scope  of 
this  thesis,  the  common  starting  point  serves  as  a  good  indicator  that  the  algorithm 
developed  herein  for  linear  time-invariant  and  linear  time-varying  DAEs  may  have 
applications  in  the  more  general  case  as  well. 

While  there  are  many  more  interesting  and  useful  items  in  the  theory  of  DAEs, 
these  few  properties  and  definitions  that  we  have  mentioned  will  suffice  for  our  pur¬ 
poses.  For  a  complete  treatment  of  DAEs  see  [5,  7,  8,  9,  10,  13,  14]. 


C{t)  0 
0  I 


2  =  f(t) 


(1.12) 
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1.1.2  Numerical  Solutions 

While  it  is  not  our  goal  to  present  an  exhaustive  overview  of  the  numerical  meth¬ 
ods  that  can  be  used  to  solve  descriptor  systems,  we  briefly  mention  those  methods 
which  will  be  used  in  later  chapters.  The  discretization  methods  we  will  review  are 
those  that  are  used  by  the  commercial  software  in  which  we  implement  the  FDI  al¬ 
gorithm  developed  in  this  thesis,  namely  the  trapezoidal  method,  the  Compressed 
Hermite-Simpson  method,  and  the  4-stage  implicit  Runge-Kutta  method.  These  di¬ 
rect  transcription  discretizations  will  be  described  using  the  semi-explicit  DAE  form 
(1.8).  After  discretization,  several  methods  exist  for  solving  the  resulting  finite  di¬ 
mensional  problem.  Of  those  methods,  only  the  sparse  quadratic  program  (SQP)  will 
be  described  here.  While  the  software,  Boeing’s  Sparse  Optimal  Control  Software 
(SOCS),  which  will  be  introduced  in  a  later  chapter,  can  solve  DAEs  via  an  ana¬ 
lytic  transformation,  as  well  as  Eider’s  and  linear  multistep  methods,  these  schemes 
will  not  be  used  in  this  thesis,  and  thus  will  not  be  mentioned  here.  Many  of  the 
approaches  applied  to  DAEs  are  described  in  detail  in  [5,  13]. 

For  our  discussion  of  discretization  and  finite  dimensional  problem  solution,  con¬ 
sider  a  simple  optimization  problem  based  on  the  semi-explicit  DAE  (1.8) 

min  J[x{t),u{t),t]  (1.13a) 


subject  to 


x'  =  f[x{t),u{t),t] 
0  =  g[x{t),u{t),t]. 


(1.13b) 

(1.13c) 
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Discretization 

In  general,  transcription  discretization  schemes  start  by  dividing  the  time  interval, 
[to,tf],  into  n  segments 

to  <ti  <t2  <  ...  <tn  =  tf 

where  the  points  4,  k  =  0,  ...n,  are  referred  to  as  mesh  points.  Let  Xk  =  x{tk)  be 
the  value  of  a  state  variable  at  a  mesh  point.  Likewise,  denote  the  value  of  a  control 
at  a  mesh  point  as  Uk  —  u{tk).  Let  fk  =  f{xk,Uk,tk)  and  Qk  =  g{xk,Uk,tk)  be  the 
right-hand  sides  of  (1.13b)  and  (1.13c),  respectively.  Finally,  let  hk  —  4  -4-i  be  the 
step  size  for  k  =  l,...,n. 

Utilizing  this  notation,  the  trapezoidal  method  approximates  the  state  equations 
(1.13b)  and  algebraic  constraints  (1.13c)  as 

^k  =  ^k-l  + -^{fk  +  fk-i)  (1.14a) 

0  =  9k.  (l-14b) 

In  the  Compressed  Hermite-Simpson  scheme  we  denote  the  value  of  the  control  at 
the  midpoint  of  a  segment  as  Uk  =  m(4)  where  4  =  |(4  +  4-i))  for  k  =  I,  ...,n.  The 
discrete  approximations  for  this  method  are  given  by 

Xk  =  Xk-I  +  -^{fk  +  k 'h  fk-l)  (1.15a) 

0  =  gk  (l-15b) 

where 

7  k  =  f{^k,u,ik) 

with 
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for  k  —  1,  The  4-stage  implicit  Runge-Kutta  discretization  uses  four  intermedi¬ 
ate,  implicit  steps 


Cl  — 

(1.16a) 

C2  =  hkf{Xh-\  +  -^,Uk,tk) 

(1.16b) 

C3  =  hkf{xk-i  +  -^,Uk,ik) 

(1.16c) 

C4  =  hkf(X)~—\  C^iUkitk)- 

(1.16d) 

The  discrete  approximations  for  this  method  then  become 

Xk  =  Xk-l  + -{ci -\- 2C2  +  2C3  +  C4) 

0 

(1.17a) 

0  =  gk 

(1.17b) 

where  Uk  is  defined  as  before. 

These  methods  have  all  been  proven  to  converge  for  index  one  DAEs,  and  are 
thus  appropriate  for  our  purpose  [4,  5].  In  every  case,  the  result  of  the  discretization, 
when  combined  with  the  cost  function,  J,  is  a  sparse  nonlinear  programming  (NLP) 
problem.  The  variables  of  the  problem  are  the  discretized  states,  x^,  controls,  Uf.,  and 
time,  tjt,  for  k  =  0,  ...n. 

Solving  the  Finite  Dimensional  Problem 

One  way  to  solve  this  NLP  problem,  and  the  approach  used  by  SOCS,  is  via  a  SQP 
approach  [3].  Dropping  subscripts  for  now,  let  w  be  the  vector  of  state  and  control 
variables,  {x,  u),  and  let  F{w,t)  be  the  constraint  set  resulting  from  the  di.scretization 
of  the  DAE.  That  is,  (1-14),  (1.15),  or  (1.17),  after  shifting  everything  to  the  right 
hand  side,  becomes 


0  =  F{w,  t). 
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Note  that  is  a  function  of  the  state,  control,  and  time  variables  at  all  time  steps. 
The  SQP  algorithm  requires  an  initial  guess,  w^,  and  forms  a  new  iterate  by  adding 
a  scalar  multiple,  a,  of  the  search  direction,  p.  That  is, 

+  ap. 

The  search  direction  is  found  by  solving  a  quadratic  programming  (QP)  subproblem 
defined  at  the  current  point.  The  QP  subproblem  is  defined  as 

min  JIp  +  ^p^Hp 

subject  to 

0  =  Gp 

where  Jyj  is  the  gradient  vector  of  the  cost  function,  H  is  an  approximation  to  the 
Hessian  matrix  of  the  Lagrangian  of  the  NLP  (L  =  J  —  X'^F),  and  G  is  the  Jacobian 
matrix  of  gradients  of  the  constraints  F.  The  step  length,  a,  is  computed  such  that 
H  remains  positive  definite.  The  QP  subproblem  can  be  solved  via  either  a  sparse 
Schur- Complement  method,  when  appropriate,  or  a  null-space  quadratic  program¬ 
ming  algorithm  when  G  and/or  H  are  dense.  Details  about  the  latter  can  be  found 
in  [3]. 

An  algorithm  based  on  the  combination  of  a  direct  transcription  scheme  and  the 
SQP  approach  begins  with  a  discretization  and  an  initial  guess.  The  SQP  problem  is 
then  solved  via  the  QP  subproblem  iteration.  After  each  QP  subproblem  is  solved, 
the  current  point  is  updated  and  the  procedure  is  repeated.  The  subproblem  itera¬ 
tion  terminates  when  a  point  is  reached  which  satisfies  necessary  conditions  for  a  local 
minimum  within  a  given  set  of  tolerances.  The  solution  is  then  compared  to  that  of 
the  previous  discretization  iteration,  or  the  initial  guess  if  it  is  the  first  iterate.  The 
mesh  is  refined,  the  problem  re-discretized  and  the  process  is  repeated  until  succes¬ 
sive  iterates  agree  within  an  additional  given  set  of  tolerances.  The  QP  subproblem 
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demonstrates  quadratic  convergence  under  the  right  conditions  [3].  Convergence  rates 
for  the  direct  transcription  schemes,  when  applied  to  index  one  descriptor  systems, 
are  at  least  quadratic,  and,  under  the  right  system  coefficient  conditions,  often  are 
considerably  better  [5]. 

1.2  Fault  Detection  and  Isolation 

With  this  basic  understanding  of  the  theory  of  descriptor  systems  and  numerical 
methods  for  their  treatment,  we  now  turn  our  attention  to  the  various  approaches  for 
treating  faults  in  those  systems.  We  begin  with  basic  control  theory,  and  then  turn 
to  feedback,  the  link  between  control  theory  and  FDI.  Following  that  is  a  discussion 
of  the  elements  of  optimal  control  and  Hoc  control  pertinent  to  our  approach.  We 
conclude  with  a  discussion  of  existing  research  into  FDI  in  descriptor  systems  and 
the  methods  used. 

1.2.1  Basic  Theory 

A  descriptor  system  is  one  possible  result  of  a  system  design  problem.  The  problem 
begins  with  a  task  to  be  accomplished,  and  the  design  engineer  is  usually  given  goals 
or  objectives  that  describe  the  desired  performance  characteristics  of  the  system  along 
with  a  set  of  constraints  by  which  the  system  is  bound.  The  development  of  a  system 
which  accomplishes  the  objectives  while  meeting  the  constraints  is  the  system  design 
problem. 

A  particular  type  of  system  design  problem  is  the  control  problem,  in  which  the 
goal  is  to  generate  certain  outputs  from  the  system  or  to  maintain  the  state  of  the 
system  within  certain  bounds.  For  example,  an  engineer  might  be  asked  to  design 
a  satellite  attitude  control  system  which  does  not  consume  too  much  fuel  [1].  The 
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essential  elements  of  such  a  control  problem  are 

•  a  mathematical  model  of  the  system, 

•  a  desired  output, 

•  a  set  of  admissible  controls, 

•  a  performance  measure. 

Often,  as  stated  above,  a  descriptor  system  is  the  natural  product  of  the  system 
design  problem.  For  the  remainder  of  this  thesis,  we  will  restrict  most  of  our  study 
to  linear  time-invariant  systems  (1.5).  Comments  extending  our  algorithm  to  linear 
time-varying  systems  (1.6)  are  included  in  a  later  chapter. 

Consider  a  system  based  on  the  linear  time  invariant  DAE  (1.5) 

x'  =  Ax  +  Bu  (1.18a) 

y  =  Cx  (1.18b) 

where  x,  y,  and  u  are  the  state,  output,  and  control  vectors,  respectively,  and  the  time 
interval  considered  is  t  e  [to,tf]-  Systems  often  allow  for  noise  or  unknown  inputs  by 
adding  a  term  to  each  equation  of  (1.18) 

x'  =  Ax  +  Bu  +  Mp,  (1.19a) 

y  =  Cx  +  Np  (1.19b) 

where  p  is  the  noise  or  unknown  input,  and  the  matrices  M  and  N  are  the  weight 
matrices  for  the  state  and  output  noise,  respectively. 

Central  to  the  study  of  system  (1.18)  are  the  concepts  of  controllability,  observ¬ 
ability,  and  stability  [6]. 

Definition  1.4.  A  linear  system  is  said  to  be  controllable  at  to  if  it  is  possible  to 
find  some  input  function  u{t),  defined  over  t  e  [to,tf],  which  will  transfer  the  initial 
state  x{to)  to  the  origin  at  some  finite  time  ti  €  [to,  tf],  ti  >  to-  U  ^his  is  true  for  all 
initial  times  to  and  all  initial  states  x{to),  the  system  is  completely  controllable. 
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Definition  1.5.  A  linear  system  is  said  to  be  observable  at  to  if  x{to)  ca,n  be 
determined  from  the  output  function  y[to,u]  for  to  G  [to,tf]  o.nd  to  <  ti,  where  ti 
is  some  finite  time,  ti  E  [to,tf].  If  this  is  true  for  all  to  and  x{to),  the  system,  is 
completely  observable. 

Since  controllability  describes  the  ability  of  the  control  to  affect  the  system  state, 
it  involves  the  matrices  A  and  B.  Likewise,  since  observability  describes  the  ability  of 
the  output  to  characterize  the  state,  it  involves  the  matrices  A  and  C.  Simply  stated, 
the  nth-order  system  (1.18)  is  controllable  if  and  only  if  [si  —  A  \  B]  has  rank  n  for 
all  values  of  s.  The  same  system  is  observable  if  and  only  if  [si  —  A'^  [  C^]  has  rank 
n  for  all  values  of  s.  Proofs  of  these  characteristics  can  be  found  in  [6],  along  with 
the  requirements  for  controllability  and  observability  in  more  complex  systems. 

The  concept  of  stability  helps  us  deal  with  systems  that  may  not  be  controllable 
and/or  observable.  Stability  is  closely  related  to  the  eigenvalues  of  the  A  matrix. 
Intuitively,  a  solution  to  (1.18)  is  stable  if  we  can  stay  close  to  the  solution  by  start¬ 
ing  close  enough  to  it  via  the  initial  condition.  A  solution  is  asymptotically  stable  if, 
by  starting  close  enough,  we  converge  to  the  solution.  A  system  is  stabilizahle  if  all 
unstable  modes  are  controllable,  and  detectable  if  all  unstable  modes  are  observable. 
Thus  the  system  can  be  handled  effectively  provided  all  uncontrollable  and  unobserv¬ 
able  modes  are  stable.  This  situation  can  often  be  tolerated  in  a  control  system  [6]. 
For  the  remainder  of  this  thesis,  we  will  a.ssume  that  we  are  dealing  only  with  the 
controllable  and/or  observable  modes  of  control  systems. 

1.2.2  Feedback  and  Observer  Design 

The  bridge  between  basic  control  theory  and  fault  detection  is  the  concept  of  feedback. 
In  a  feedback  control  system,  the  control,  u{t),  is  modified  by  information  about  the 
system.  Sensors  measure  either  the  system  state,  or  the  output,  and  then  pass  that 
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information  to  the  controller,  which  adjusts  the  control  based  on  the  input  from 
the  sensors.  One  of  the  fundamental  goals  of  feedback  compensator  design  is  to 
improve  the  performance  of  the  system  through  eigenvalue  placement.  As  stated 
earlier,  stability  depends  on  the  eigenvalues  of  the  A  matrix.  By  assigning  desirable 
values  to  eigenvalues,  system  stability  can  be  enhanced.  For  the  state  feedback  case, 
the  relation 

u(t)  =  Fv(t)  -  Kx{t)  (1.20) 

is  used,  where  the  matrix  K  is  called  the  feedback  gain  matrix,  and  F  the  feed-forward 
matrix.  Substituting  into  (1.18),  we  obtain 

x'  =  {A-BK)xPBFv  (1.21a) 

y  =  Cx.  (1.21b) 

Clearly,  the  eigenvalues  of  the  A  -  BK  matrix  now  determine  the  stability  of  the 
system.  By  careful  construction  of  the  feedback  gain  matrix  K,  the  eigenvalues  are 
assigned  the  desired  values.  For  the  output  feedback  case,  the  relation 

u{t)  =  Fvit)  -  Ky{t)  (1.22) 

is  used,  where  the  K  and  F  matrices  are  as  defined  above.  Substituting  this  relation 
into  (1.18),  we  obtain 

x'  =  {A-  BKC)x  +  BFv.  (1.23) 

Now  the  eigenvalues  of  the  A  -  BKC  matrix  determine  the  stability  of  the  system. 
Unfortunately,  due  to  the  presence  of  the  C  matrix  in  this  expression,  output  feedback 
usually  cannot  place  all  of  the  eigenvalues  of  the  system.  This  limitation  is  present 
when  the  rank  of  KC  is  less  than  the  rank  of  K,  i.e.,  when  C  is  a  long  matrix.  It 
should  be  noted  that  state  feedback  may  impact  the  observability  of  a  system,  but  can 
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have  no  impact  on  controllability.  Output  feedback  can  impact  neither  controllability 
nor  observability  of  a  system  [6]. 

Using  feedback,  the  basic  tool  for  many  FDI  approaches  can  be  constructed;  the 
observer.  For  most  systems  the  only  information  about  the  system  state  is  through 
the  output  vector,  which  often  provides  only  partial  information.  Thus,  output  feed¬ 
back  is  the  only  option,  and  not  all  system  eigenvalues  can  be  placed  where  desired. 
To  improve  system  stability  in  these  cases,  the  most  frequently  used  method  is  to 
reconstruct  information  about  the  remaining  elements  of  the  state  vector  through 
development  of  an  observer  of  the  system.  Consider  the  observer 

x'^Ax  +  Bu  +  L{y-Cx)  (1.24) 

where  x  is  the  observer  estimate  for  the  state  vector.  Note  that  y  is  the  output  from 
the  real  system,  (1.21),  and  Cx  is  the  observer  output.  Subtracting  (1.18a),  and 
letting  e  —  X  —  xhe  the  observer  error,  we  obtain 

e'  ^  {A-  LC)e. 

Since  L  is  arbitrary  and  {A,  C)  is  observable,  we  can  guarantee  that  observer  error 
goes  to  zero  by  selecting  L  so  that  A  -  LC  is  stable.  With  this  construction,  state 
feedback  is  possible  using  the  observer  estimate  for  the  state  vector.  Thus  all  system 
eigenvalues  can  be  placed  where  desired,  and  complete  control  over  system  stability  is 
possible.  It  should  be  noted  that  since  the  complete  state  vector  is  reconstructed  by 
the  observer,  faults  whieh  send  the  system  into  unpredicted  or  undesirable  states  may 
be  detectable  by  such  an  observer  simply  by  comparing  the  observer  estimate  with 
those  elements  of  the  system  state  vector  which  are  available.  This  fault  detection 
can  be  accomplished  without  using  the  observer  to  affect  any  feedback  compensator 
for  the  system. 
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1.2.3  Optimal  Control 

Later,  when  we  develop  our  algorithm,  we  will  work  with  a  control  system  which  acts 
as  the  constraints  in  an  optimization  problem.  This  optimal  control  structure  is  key  to 
the  multi-model  approach  to  FDI,  which  we  will  discuss  in  Section  1.2.5.  Accordingly, 
we  briefly  review  optimal  control  theory.  While  this  area  of  study  is  vast,  the  only 
topic  which  we  will  need  for  our  discussion  is  the  state  regulator  problem,  also  called 
the  linear  quadratic  regulator  (LQR)  problem.  Consider  the  optimization  problem 

x^Qx  +  uFRu  dt  (1.25a) 

subject  to 

x' =  Ax  +  Bu  (1.25b) 

as  well  as  some  initial  conditions  at  the  beginning  of  the  interval,  where  Sf,  Q,  and 
R  are  the  weight  matrices  for  the  terminal  cost,  the  trajectory,  and  the  control.  It  is 
assumed  that  Q  is  positive  semi-definite  and  R  is  positive  definite.  This  is  one  form 
of  the  LQR  problem  and  it  is  important  for  three  reasons.  First,  the  theory  is  elegant 
and  robust.  Results  are  easy  to  understand  and  implement  in  numerical  algorithms. 
Second,  it  has  strong  geometry.  J{x,  u)  is  actually  an  inner  product  norm  with  useful 
properties.  Finally,  there  are  strong  physical  correlations  to  this  type  of  problem. 
Energy  is  a  quadratic  form,  as  is  power. 

As  with  any  optimization  problem,  the  LQR  problem  possesses  necessary  condi¬ 
tions  for  a  minimum.  For  the  problem  (1.25),  we  construct  the  Hamiltonian 

H{x,  A,  u)  =  i  {x^Qx  +  u'^Ru)  +  X^{Ax  P  Bu)  (1.26) 

where  A  is  the  Lagrange  multiplier  for  the  constraints.  The  Euler  equations,  which 


1  1  /" 
J{x,u)  =  min  -x{tff Sfx{tf)  +  -J^ 
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must  be  satisfied  by  any  extremum  of  the  problem,  are 


II 

(1.27a) 

II 

1 

(1.27b) 

Hi  =  0. 

(1.27c) 

When  applied  to  (1.26),  we  obtain 

A' 

=  —Qx  —  A^A 

(1.28a) 

x' 

=  Ax  +  Bu 

(1.28b) 

0 

=  Ru  +  B^A. 

(1.28c) 

Using  (1.28c)  and  our  assumption  that  R  >  0,  we  can  eliminate  u  from  (1.28)  to 
obtain  a  set  of  differential  equations  in  x  and  A 

x'  =  Ax-BR-^B^A  (1.29a) 

A'  =  -Qx-A^A.  (1.29b) 

While  this  form  will  be  useful  in  our  algorithm,  it  is  possible  to  take  an  additional  step 
and  eliminate  A,  resulting  in  a  matrix  Riccati  differential  equation  for  the  optimal 
control  feedback  gain  matrix.  The  derivation  of  the  Riccati  equation  will  be  detailed 
when  we  develop  our  algorithm  in  the  next  chapter.  It  should  be  noted  that  our 
assumptions  on  the  Q  and  R  matrix,  while  not  restrictive  in  an  applicability  sense, 
guarantee  that  the  extremum  which  satisfies  the  necessary  conditions  represents  at 
least  a  local  minimum  of  the  cost  J{x,u).  In  fact,  Q  is  often  positive  definite,  and 
in  that  case,  the  conditions  for  an  extremum  are  necessary  and  sufficient.  Detailed 
discussions  of  this  and  other  topics  in  optimal  control  can  be  found  in  [1,  6,  38]. 
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1.2.4  Hoo  Control 

Hoo  control  in  the  time  domain  is  similar  to  optimal  control.  It  takes  advantage  of 
the  linear  quadratic  (LQ)  form  in  addressing  significant  uncertainties  in  the  energies 
of  system  noises.  For  bounded  energy  noise  inputs,  where  little  or  no  other  knowledge 
is  available  about  the  signal,  the  LQR  formulation  is  an  elegant  worst-case  approach. 
The  model  generally  takes  the  form  of  (1.19),  and  all  functions  are  assumed  to  exist  in 
the  space  of  square  integrable  functions,  denoted  While  Hoo  performance  criteria 
vary,  they  all  share  the  structure  of  the  optimal  control  cost  function,  that  is,  they 
are  all  in  LQ  form. 

In  this  setting,  filtering,  smoothing,  and  compensator  design  are  efficiently  ac¬ 
complished.  Nagpal  and  Khargonekar  [27]  apply  a  filtering  and  smoothing  method 
using  an  Hoo  performance  criterion  on  both  the  finite  and  half-infinite  intervals  to 
accomplish  state  estimation  (filtering)  and  output  smoothing.  Tadmor  [36]  attempts 
to  find,  in  LQ  game-theoretic  terms,  the  compensator  which  provides  the  best  control 
in  response  to  the  worst  disturbance.  Matrix  Riccati  equations  provide  solutions  in 
each  case. 

While  the  structure  of  our  problem  is  very  similar  to  the  Hoo  problem,  several 
key  differences  will  become  apparent.  First,  we  will  solve  a  different  problem.  While 
Tadmor  [36]  designs  a  worst-case  compensator,  and  Nagpal  and  Khargonekar  [27] 
solve  for  the  optimal  filter  and  smoother  in  the  face  of  various  initial  conditions,  we 
will  solve  for  the  optimal  fault  detection  signal.  In  addition,  while  both  [27]  and 
[36]  work  in  single  model  systems,  we  will  work  in  a  multi-model  system.  Finally, 
while  the  noise  present  in  our  system  is  also  L^,  it  is  not  the  same  kind  of  signal  as 
is  commonly  assumed  in  Hoo  control.  The  impact  of  these  differences  will  become 
significant  as  our  problem  is  defined  and  our  algorithm  developed. 
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1.2.5  Prior  Research 

In  addition  to  the  two  approaches  mentioned  above,  fault  detection  and  isolation  in 
linear  control  systems  has  been  attempted  from  many  angles.  To  begin,  we  note  that 
there  exist  two  basic  types  of  approaches  to  FDI:  passive  and  active.  In  the  passive 
approach,  only  monitoring  of  system  performance  is  allowed.  No  interaction  with  the 
system  occurs,  either  for  material  or  security  reasons.  The  system  states  (or  otitputs) 
are  measured  and  compared  to  “normal”  system  behavior,  generating  a  residual.  The 
residual  is  computed  such  that  it  is  equal  or  close  to  zero  when  no  faults  are  present, 
and  much  different  from  zero  when  a  fault  occurs.  The  vast  majority  of  research  in 
FDI  using  the  passive  approach  applies  observers  to  generate  residuals. 

Passive  Methods 

Nuninger  et  al.  [32]  use  analytic  redundancy  in  order  to  detect  sensor  and  actuator 
failures  or  process  disturbances.  Analytic  redundancy  attempts  to  generate  a  residual 
that  might  contain  information  about  the  faults.  Two  methods  for  generating  the 
residual  are  examined.  First,  direct  residual  generation  is  based  on  the  parity  space 
approach,  using  the  input-output  transfer  function  (the  parity  equation).  Second, 
indirect  residual  generation  is  based  on  output  estimation,  using  an  observer  to  do 
state  estimation  first.  The  authors  apply  the  first  method  to  known  input  systems 
and  the  second  method  to  both  known  and  unknown  input  systems.  A  drawbaek  of 
this  approach  is  that  some  faults  may  have  no  influence  on  the  residuals  generated  by 
either  method,  so  perfect  FDI  is  not  attained.  Chen  and  Speyer  [15]  also  use  analytic 
redundancy,  generating  a  residual  via  an  observer  that  reconstructs  the  system  state 
vector.  Their  model  has  the  target  fault  direction  explicitly  in  the  detection  filter 
gain  calculations,  allowing  for  enhanced  sensitivity  of  the  filter  to  the  target  fault. 
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Koenig  et  al.  [24]  present  a  comparative  study  of  several  design  methods  for  un¬ 
known  input  observers  (UIO)  used  for  FDI  and  Correction.  Their  goal  is  to  design 
an  integrated  approach  which  can  detect,  isolate,  and  correct  a  large  variety  of  faults 
for  a  desired  system  with  real-time  computation  constraints.  Methods  compared  are: 
failure  isolation  by  using  banks  of  observers  (robust  to  some  faults,  but  sensitive  to 
others,  in  combination  so  that  all  faults  are  detectable),  failure  isolation  by  observer 
pole  placement  (to  create  an  unknown  input  fault  detection  observer),  and  failure 
correction  via  general  structured  UIO  (design  of  full  order  observer  to  estimate  states 
as  well  as  unknown  inputs).  Chung  and  Speyer  [17]  develop  a  game  theoretic  detec¬ 
tion  filter,  which  is  similar  to  the  UIO,  that  attenuates  disturbances,  bounding  all 
signals  except  the  fault  to  be  detected,  embedding  the  exogenous  signals  into  an  un¬ 
observable,  invariant  subspace.  The  subspace  structure  is  used  to  reduce  the  order  of 
the  limiting  filter  by  factoring  the  invariant  subspace  out  of  the  state  space,  resulting 
in  a  lower  order  filter  sensitive  only  to  the  fault  to  be  detected.  The  filter  is  applied 
to  the  flight  control  characteristics  of  the  F-16XL  and  a  simple  rocket. 

The  parity  relation  approach  to  residual  genertion  is  applied  by  Youssouf  and 
Kinnaert  [40].  The  method  is  based  on  the  inverse  of  the  map  from  both  unknown 
inputs  and  faults  to  the  observable  signals  (measured  inputs  and  outputs),  using  a 
variable  change  in  the  frequency  domain.  Tools  available  for  nonsingular  systems  can 
be  used  on  the  resulting  map.  The  authors  contend  that  FDI  for  singular  systems 
depended  previously  on  state  estimation,  which  put  unnecessary  requirements  on 
the  plant,  as  there  is  no  need  to  reconstruct  the  entire  state  vector  to  do  residual 
generation.  Where  Youssouf  and  Kinnaert  [40]  apply  their  method  to  continuous  time 
systems,  Sauter  et  al.  [35]  do  the  same  for  the  discrete  case  in  mechanical  systems, 
though  the  theory  and  algorithm  are  completely  different.  They  do  state  equation 
decoupling  before  residual  generation,  which  is  contrary  to  customary  methods.  The 
decoupling  involves  separating  out  the  unknown  input  from  the  system  state  instead 
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of  from  the  residual. 

Chowdhury  and  Aravena  [16]  go  in  a  slightly  different  direction.  They  apply  a 
modular  methodology  to  the  area  of  fast  fault  detection  and  classification  in  dynamical 
electrical  power  systems.  Module  I  is  the  generation  of  fault  indicators  in  one  of  two 
ways: 

1.  model-based,  in  which  a  residual  is  generated  using  either  an  accurate  mathe¬ 
matical  or  I/O  model  of  the  nominal  system,  or  an  I/O  model  is  built  on-line, 
which  is  very  difficult, 

2.  model-free,  in  which  detectable  variables  are  measured  and  enhanced  if  neces¬ 
sary  by  signal-processing  techniques. 

The  authors  present  a  model-free  orthogonal  decomposition  based  on  multirate  filter 
banks  to  produce  a  fault  indicator.  Module  II  is  the  measuring  and  testing  of  fault  in¬ 
dicators  via  either  statistical  test  or  feedforward  neural-network  testing.  The  authors 
explore  the  neural  network  aspect.  Fault  classification  occurs  in  module  III,  another 
neural  network,  the  operation  of  which  depends  on  the  existence  of  a  system  model. 
The  emphasis  is  on  model-free  methods,  those  lesser  explored  and  lesser  restrictive 
cases  where  models  are  not  available,  non-linearities  prevent  model  derivation,  or  too 
many  uncertainties  exist  in  the  system.  These  cases  appear  to  hold  the  most  promise 
for  neural-network  applications  in  fault  detection. 

Hybrid  Passive- Active  Methods 

Some  research  has  been  done  using  a  hybrid  of  the  passive/active  approaches.  The 
passive  approach  is  used  to  detect  faults,  then  an  active  approach  is  used  to  cor¬ 
rect  or  compensate  for  faults  through  feedback.  Zhang  and  Jiang  [43]  investigate 
the  application  of  integrated  fault  detection,  diagnosis,  and  reconfigurable  control 
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to  discrete-time  stochastic  vertical  take-off  and  landing  aircraft  systems.  A  bank  of 
two-stage  adaptive  Kalman  filters  is  used  for  FDI,  and  statistical  decisions  are  made 
for  fault  detection,  diagnosis  and  activation  of  controller  reconfiguration.  In  a  related 
paper  [42],  the  same  authors  apply  an  interacting  multiple-model  based  approach  to 
the  same  type  of  control  problem.  A  finite-state  Markov  chain  is  linked  to  the  same 
Kalman  filter  bank  for  fault  diagnosis.  The  decision  from  this  process  is  used  to 
activate  system  reconfiguration  via  eigenstructure  assignment. 

Active  Methods 

The  drawback  inherent  in  the  passive  approach  is  that  faults  can  be  masked  by  the 
application  of  the  control.  Thus  it  is  possible  that  a  fault  could  go  undiscovered  until 
it  is  too  late  to  correct  it.  In  direct  contrast,  the  active  approach  interacts  with  the 
system  on  a  periodic  basis,  or  at  critical  times,  to  detect  faults,  thus  eliminating  the 
possibility  of  the  presence  of  undetected  faults.  The  approach  uses  various  types  of 
interaction  with  the  system  to  detect  faults.  A  test  signal,  which  is  constructed  in 
such  a  way  that  faults  are  highlighted,  is  fed  into  the  system.  Observation  and/or 
manipulation  of  the  resulting  output  is  used  to  make  a  decision  about  system  faults. 
Observers  designed  to  aid  in  feedback,  as  well  as  various  other  types  of  feedback 
compensators,  are  examples  of  the  active  approach. 

Bennett  et  al.  [2]  apply  speed  dependent  feedback  (a  stable  time- varying  linear 
observer)  to  detect  intermittent,  short  duration  faults  in  bilinear  dynamical  systems. 
The  AC  drive  system  for  an  electric  train  is  considered.  These  systems  experience 
abrupt  disconnections  which  introduce  severe  transient  errors  and  which  are  hard  to 
detect  due  to  their  short  durations.  The  parity  equation  approach  is  not  preferred  in 
this  case  due  to  the  intermittent  nature  of  the  faults.  By  combining  the  observer  and 
Kirchoff’s  law,  a  bank  of  observers  is  constructed  to  detect  and  correct  disconnections. 
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This  case  is  an  example  of  the  application  of  a  test  signal  as  part  of  the  feedback  to 
control  the  system  and  correct  faults. 

The  multi-model  approach  is  well-suited  to  the  case  where  it  is  desirable  to  apply 
a  test  signal  independent  of  feedback  control.  The  approach  relies  on  the  presence  of 
the  system  model 

x\  —  AiXi  +  BiV  +  MiPi  (1.30a) 

y  =  CiXi  +  NiPi  (1.30b) 

for  i  =  0,  where  m  is  the  number  of  faults  expected  from  the  system,  and 
V  is  the  test  signal.  A  different  system  model  exists,  with  known  parameters,  for 
each  possible  fault.  It  is  assumed  that  any  feedback  control  has  been  absorbed  into 
the  Ai  matrix,  thus  eliminating  the  control  u  from  the  differential  equation.  The 
difficulties  in  this  approach  lie  in  determining  from  which  model  an  output  y  derives, 
as  well  as  the  computation  of  system  parameters  for  each  fault  model.  Nikoukhah 
[28]  presents  the  use  of  a  test  signal  for  active  FDI  in  discrete-time  linear  systems 
subject  to  inequality-bounded  perturbations.  Detectability  is  required,  but  when 
present,  guaranteed  FDI  is  attained.  The  discrete  time  case  lends  itself  to  recursive 
algorithms,  and  so  recursion  is  used  extensively  by  the  author  to  develop  the  test 
signal.  After  constructing  a  test  signal  that  separates  outputs  into  disjoint  convex 
sets,  the  author  uses  the  separating  hyperplane  approach  to  determine  which  set  a 
certain  output  falls  into,  and  thus  whether  a  fault  has  occurred.  Linear  programming 
is  used  to  construct  the  separating  hyperplane.  Nikoukhah  et  al.  [29]  has  the  same 
goal  as  [28],  but  goes  about  it  completely  differently.  Among  the  differences,  fault 
isolation  is  accomplished  by  a  ratio  test,  and  optimal  control  theory  is  applied.  This 
paper  is  the  inspiration  for  our  current  research,  and  thus  uses  some  of  the  same 
techniques  we  use,  but  applies  them  to  discrete  time  control  problems.  We  apply 
our  methodology  to  continuous  time  control  problems,  which  present  key  theoretic 
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and  algorithmic  differences.  In  addition,  both  [28]  and  [29]  consider  only  two  model 
systems,  whereas  our  approach  can  handle  problems  with  three  or  more  models. 

The  multi-model  approach  is  also  useful  with  Kalman  filtering.  Keller  et  al.  [21] 
presents  the  multi-model  approach  for  fault  detection  in  stochastic  systems  with  un¬ 
known  inputs.  The  method  uses  the  two-stage  Kalman  filter  with  unknown  inputs 
and  constant  biases,  the  first  stage  of  which  is  bias-free  (for  fault  detection)  and  the 
second  stage  is  a  bias  filter  (for  fault  isolation).  The  optimum  state  estimate  is  ex¬ 
pressed  as  the  output  of  the  bias-free  filter  corrected  with  the  output  of  the  bias  filter. 
Different  fault  types  are  detected  using  a  bank  of  such  filters.  The  two  stages  of  the 
filter  reduce  computational  time  associated  with  the  presence  of  multiple  faults. 

1.2.6  Conclusion 

As  mentioned  in  the  introduction  to  this  chapter,  the  combination  of  the  multi¬ 
model  approach  and  the  bounded  energy  noise  model  seems  to  be  under-explored. 
The  common  thread  running  through  most  of  the  applications  mentioned  in  the  last 
section  is  the  modeling  of  noise.  [16,  21,  22,  23,  33,  37,  41,  42,  43]  model  noise  as 
some  type  of  random  variable.  Many  use  filtering  or  statistical  tests  to  make  the  fault 
decision,  and  thus  do  not  model  noise  at  all.  Only  [17,  27,  28,  29,  36]  model  noise 
as  bounded  energy  signals.  As  we  shall  see,  the  bounded  energy  noise  model  is  very 
suited  to  the  multi-model  approach,  and  the  combination  as  developed  in  this  thesis 
provides  a  powerful  tool  for  fault  detection  and  isolation  in  descriptor  systems. 

1.3  Outline  of  Thesis 

In  the  next  chapter,  we  present  the  theory  and  algorithm  for  the  fault  detection 
phase  of  the  problem,  along  with  extensions  of  the  method  to  variations  of  the  basic 
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problem.  Following  that,  Chapter  3  is  the  development  of  the  algorithm  for  the 
model  identification  phase.  In  Chapter  4,  we  state  the  full  algorithm,  then  present 
and  analyze  several  examples.  Lastly,  Chapter  5  is  the  conclusion  and  outline  of  future 
research  possibilities  in  this  area.  Software  codes  for  the  algorithm  are  in  Appendix 
A. 

1.4  Contributions  of  Thesis 

The  research  in  this  thesis  will  appear,  or  has  already  appeared  in  the  following 
publications: 

•  S.  L.  Campbell,  K.  Horton,  R.  Nikoukhah,  and  F.  Delebecque,  Rapid 
Model  Selection  and  the  SeparaMlity  Index,  in  Proc.  4th  IFAC  Sympo¬ 
sium  on  Fault  Detection,  Supervision  and  Safety  for  Technical  Processes 
(SAFEPROCESS  2000),  Budapest,  Hungary,  June  2000,  pp.  1187-1192. 

•  R.  Nikoukhah,  F.  Delebecque,  S.  L.  Campbell,  and  K.  Horton,  Multi¬ 
model  Identification  and  the  Separability  Index,  in  Proc.  14th  Interna¬ 
tional  Symposium  of  the  Mathematical  Theory  of  Networks  and  Systems 
2000,  Perpignan,  France,  June  2000,  CDROM. 

•  R.  Nikoukhah,  S.  L.  Campbell,  Kirk  Horton,  and  F.  Delebecque,  Auxiliary 
signal  design  for  robust  multi-model  identification,  IEEE  Transactions  on 
Automatic  Control,  accepted  subject  to  final  revision. 

•  S.  L.  Campbell,  Kirk  Horton,  R.  Nikoukhah,  and  F.  Delebecque,  Auxil¬ 
iary  signal  design  for  rapid  multi-model  identification  using  optimization, 
submitted  to  Automatica. 
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Fault  Detection  via  the  Detection  Signal 

2.1  The  Problem  -  Finding  the  Minimum  Energy 
Detection  Signal 

As  introduced  in  the  previous  chapter,  our  goal  is  to  attain  near-perfect  fault  detection 
and  model  identification  in  linear  descriptor  systems  using  the  multi-model  approach. 
This  approach  allows  the  treatment  of  the  problem  in  two  steps.  In  this  chapter,  our 
focus  will  be  on  the  fault  detection  step  of  the  problem,  while  the  next  chapter  will 
tackle  the  model  identification  step. 

Multi-model  fault  detection  and  model  identification  means  that  we  have  two  or 
more  possible  models  for  a  system,  and  we  decide  which  one  corresponds  to  the  system 
based  on  measurements  of  the  inputs  and  outputs  of  the  system  over  a  finite  test 
period,  [0,t/].  While  other  possible  test  periods  exist,  we  will  restrict  our  discussion 
to  the  finite  interval. 

In  order  to  exclude  all  but  one  model  based  on  input-output  measurements,  the 
input  signal  must  have  special  properties.  A  signal  with  such  properties  is  called  a 
proper  detection  signal.  For  the  remainder  of  the  present  discussion  we  will  assume 
that  two  possible  models  exist  for  the  system:  the  nominal,  or  unfailed  model,  and 
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the  fault  model.  This  assumption  is  not  restrictive  in  any  way,  and  later  we  will 
describe  how  the  algorithm  can  be  extended  to  include  the  case  in  whicli  more  than 
one  fault  model  is  present. 

2.1.1  Problem  Setup 

The  true  model  of  the  system  is  one  of  two  models 

x-  =  AiXi  +  BiV  +  Mipi  (2.1a) 

y  =  CiXi  +  NiPi  (2.1b) 

for  z  =  0  and  1,  and  for  t  >  0,  where  xt,  y,  v,  and  pi  are  the  system  states,  output, 
detection  signal,  and  noise,  respectively.  The  matrices  vlj,  Bi,  Ci,  Mi,  and  Ni  are 
matrices  of  appropriate  dimensions.  We  assume  that  v  and  pi  are  in  L‘^[0,tf]  —  , 

forcing  Xi  and  y  to  be  in  as  well.  While  we  assume  full  row  rank  of  the  Mi  and  Ni, 
and  controllability /observability  of  the  system  for  computational  reasons,  there  is  no 
assumption  that  the  dimensions  of  the  state  or  noise  vectors  of  the  two  models  are  the 
same.  We  also  assume  no  a  priori  information  about  the  system  before  t  =  0,  and  in 
particular  no  information  about  a:j(0).  Thus,  unlike  some  existing  theory,  in  particular 
[30],  we  have  no  weights  on  rri(O).  (We  will  disemss  the  impact  of  information  about 
initial  conditions  and  the  subsequent  presence  of  weight  matrices  on  a;i(0)  later  in  this 
chapter.)  In  addition,  we  assume  that  any  feedback  control  has  been  absorbed  into 
the  Ai  matrices  as  described  in  Chapter  1,  or  else  is  nulled  at  t  =  0  for  the  duration  of 
the  test  period.  Thus,  the  only  common  elements  of  the  two  models  are  the  output, 
y,  and  the  detection  signal,  v.  Note  that  (2.1)  is  a  linear  descriptor  system  since  the 
output  y  is  known. 

Consider  the  detection  signal  v  and  let  A^{v)  be  the  set  of  po.ssible  outputs  asso¬ 
ciated  with  this  input  if  Model  0,  the  nominal  model,  is  the  correct  model.  Likewise, 
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let  A^{v)  be  the  set  of  outputs  if  Model  1,  the  fault  model,  is  the  correct  model.  Then 
perfect  model  identification  based  on  output  measurement  implies  that 

=  0.  (2.2) 

This  is  achievable  thanks  to  the  bounded  energy  noise  model.  This  noise  model  can 
be  expressed  as 

SiifXi)  =  \\pif  =  [  \pi{t)f  dt  <1,  i  =  0, 1  (2.3) 

Jo 

where  |  •  |  is  the  (pointwise)  Euclidean  norm,  and  thus  ||  •  ||  is  the  norm.  In  practice 
one  has  bounds  <  K.  It  is  always  possible  to  rescale  the  Mj,  Afj  to  get  K  —  1, 
so  we  assume  without  loss  of  generality  that  K  =  1. 

This  expression  for  the  noise  allows  us  to  distinguish  between  the  two  basic  types 
of  detection  signals. 

Definition  2.1.  The  detection  signal  v  is  not  proper  if  there  exist  Xq,  xi,  po,  Fi, 
and  y  satisfying  (2.1)  and  (2.3).  The  detection  signal  v  is  called  proper  otherwise. 

Thus  we  say  that  the  vector  function  d  is  a  proper  detection  signal  if  its 
application  implies  that  we  are  always  able  to  distinguish  the  two  candidate  models 
based  on  observation  y.  That  is,  condition  (2.2)  is  satisfied  [30].  Note  that  i;  =  0  is 
not  proper  since  the  zero  solution  is  always  in  the  intersection  of  (2.2).  In  addition, 
if  V  is  proper  then  cv  is  also  proper  for  c  >  1,  but  if  v  is  not  proper  then  there  exists 
an  e  >  0  such  that  cv  is  also  not  proper  for  0  <  c  <  1  +  e.  These  facts  will  be  useful 
when  we  develop  the  optimization  problem  later  in  the  chapter. 

The  conditions  for  the  existence  of  proper  detection  signals  are  quite  weak.  For 
their  characterization,  let 

£.(/)= 

Jo 


(2.4) 
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be  the  solution  of  z'  =  AiZ  +  f,  ^(O)  =  0.  Then  the  solutions  to  (2.1)  are 

Xi  =  Ci{Biv)  +  Ci{Mi^Xi)  +  e^%  ■(2.5a) 

y  =  CiCi{B,v)  +  CiCi{M,tJ,)  +  Cie^'^^^  +  Nifli  (2.5b) 

for  i  =  0, 1,  where  is  the  free  initial  condition  for  Xi.  Thus  the  output  set  for  each 
model  is  the  sum  of  three  terms 

•  =  CiCi{Biv)  which  is  a  vector  depending  linearly  on  the  detection 
signal,  V, 

•  {{CiCiMi  +  Ni)pi  :  ll/Till  <  1}  which  is  an  open  convex  set, 

•  €  3?"  (or  C”)}  which  is  a  finite  dimensional  subspace  of  L^. 
Because  of  these  facts,  and  noting  that  tjo  and  are  respectively  the  outputs  of 
Model  0  and  Model  1  corresponding  to  zero  noise  and  zero  initial  state,  we  see  that 
the  output  sets  A^{v)  and  >l^(u)  are  translates  by  yo  and  y\  of  bounded  open  sets. 
Since  t/o  and  y\  depend  linearly  on  v,  either  yo  =  Hi  for  all  u,  or  yo  —  j/i  can  be  made 
arbitrarily  large  with  proper  choice  of  v.  So  proper  detection  signals  exist  provided 
the  linear  mapping  of  v  to  yo  is  distinct  from  the  linear  mapping  of  v  to  yi  [30].  In 
the  time  invariant  case,  this  is  equivalent  to 

Co{sI  -  AoY^Bo  -  C,{sl  -  A.Y^B,  0  (2.6) 


for  some  s. 

The  amount  of  energy  required  for  a  detection  signal  to  be  proper  determines  the 
separability  of  the  output  sets  ./l°(u)  and  ^'(t>). 

Definition  2.2.  Let  V  denote  the  set  of  proper  detection  signals  v.  Then, 

T*=(inf||.||^)"  (2.7) 

is  called  the  separability  index  associated  with  (2.1). 
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Thus,  (7*)“^  is  a  lower  bound  on  the  energy  of  proper  detection  signals.  Also,  the 
inverse  relationship  between  the  separability  index  and  the  proper  detection  signal 
energy  indicates  that  systems  with  lower  energy  proper  detection  signals  have  a  higher 
separability  index.  The  separability  index  is  zero  if  there  are  no  proper  detection 
signals.  Later,  the  algorithm  we  develop  will  compute  7*  as  the  objective  function 
of  an  embedded  optimal  control  problem.  In  Section  2.4.5  we  describe  an  existing 
algorithm  that  computes  7*  [30].  Our  approach  has  the  advantage  of  being  able  to 
address  several  problems  that  the  algorithm  in  [30]  cannot  handle. 

2.1.2  Formulation  as  an  Optimal  Control  Problem 

Before  we  describe  the  algorithm,  however,  the  problem  of  finding  the  minimum 
energy  proper  detection  signal  must  be  formulated  as  an  optimal  control  problem. 
First,  note  that  for  the  detection  signal  v  to  be  not  proper,  (2.1)  must  hold  as  well 
as  (2.3).  We  can  rewrite  (2.3)  as 

maxj^  |/io(t)|^  dt,  ^  \lj,i{t)\‘^  dt^  <  1.  (2.8) 

This  expression  can  also  be  written  as 

max  f  P\po{t)\‘^  +  (I  —  dt  <  1.  (2.9) 

Jo 

Thus  we  obtain  a  useful  characterization  of  not  proper  detection  signals  [30] 

Lemma  2.1.  The  detection  signal  v  is  not  proper  if  and  only  if 

inf  max  f  ^\po{t)\'^  +  (1  —  ^)|yui(t)l^  dt  <  1  (2.10) 

Jo 

where  the  infimum  is  taken  over  {xi,pLi,y)  in  L^,  subject  to  (2.1),  i  =  0, 1. 

This  characterization  is  useful  because  the  algorithm  we  develop  will  compute  the 
minimum  energy  proper  detection  signal  by  finding  the  detection  signal  of  smallest 
norm  that  does  not  satisfy  (2.10). 
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The  next  step  in  formulating  the  computation  of  the  separability  index  as  an 
optimal  control  problem  involves  dimension  reduction.  By  assumption  the  Ni  are 
both  full  row  rank.  Thus,  we  can  perform  a  constant  orthogonal  change  of  coordinates 
on  the  Ni  (via  a  QR  decomposition  on  Nl).  As  a  result  we  obtain 

N  =  [Ni  0]  (2.11) 


where  Ni  is  invertible,  and 


Mi  =  [Mi  Mi^ 


(2.12) 


Let  Pi  =  I  with  the  same  decomposition  as  Ni,  and  subtract  (2.1b)  for  i  =  1 

[HiJ 

from  (2.1b)  for  i  =  0.  Equation  (2.1b)  becomes 

0  =  CqXq  —  CiXi  +  NqPq  —  N  i/ij.  (2.13) 

Now  we  can  solve  for  either  Pi  and  use  the  resulting  expression  to  eliminate  (2.13)  by 
substituting  it  into  (2.1a).  Solving  for  Jiq,  we  obtain 

/-v'  .  - - 

0 


a;. 

V^i 


Ao-MoNn'C'o  MoN~^Ci 


+ 


Mo  MoiVo  ^1  0 

0  Ml  Ml 


+ 


Bo 


V.  (2.14) 


With  the  obvious  correspondences,  the  reduced  system,  no  longer  a  descriptor  system, 
can  be  written  as 


x'  —  Ax  +  Bv  +  Mp. 


(2.15) 


Note  that  we  do  not  require  A  to  be  stable.  An  unstable  A  is  allowable  because  it 
includes  the  case  in  which  the  original  system  is  stable,  the  fault  model  is  unstable. 
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and  we  desire  to  detect  the  fault  in  a  short  test  period  to  prevent  the  instability  of 
the  fault  from  creating  problems  for  the  system. 

The  characterization  of  not  proper,  (2.10),  for  the  reduced  system  becomes 


where 


inf  max  P(x,  u,0)  <  1 
0</3<l 


(2.16) 


P{x,p,  P) 


Nq  CqX{)FNq  CiXiFNq  +  \po\‘^)  + 

{1  -  dt  (2.17) 


and  the  infimum  is  now  taken  over  {x,p)  in  L^,  subject  to  (2.15). 

The  third  step  in  the  transformation  to  an  optimal  control  problem  involves  using 
the  definition  of  the  Euclidean  norm  to  expand  the  integrand.  After  doing  so  and 
combining  like  terms,  we  can  rewrite  (2.17)  as 


where 


1 

P{x,p,,P)  =  -  /  x^ Qx  +  x^ H p  F  pF Rp,  dt 
2  Jo 


Q  =  2p 


C^n/n-'c^  -C^NfN^'Ci 
-O^NgNl'c^  CJn/n-'c, 


(2.18) 


(2.19) 


H  =  AF 


0  -Co^iVo^iVo'^i  0 
0  0 


(2.20) 


R  =  2 


PI 


0 


0 


0  {l-P)I  +  pN'^,NfNo'Ni  0 


(2.21) 


0  0  (1  -  p)i 

Note  that  Q  is  symmetric,  positive  semi-definite  and  R  is  symmetric,  positive  definite. 
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Finally,  letting  Sy  be  the  set  of  functions  {x,  p)  satisfying  the  constraints  (2.15), 
and  defining 


JviP)  =  inf  P{x,p,P) 
{x,^)eSv 


(2.22) 


we  call  on  a  useful  result  [30]. 

Theorem  2.1.  The  function  P  has  a.t  least  one  saddle  point  (a;'*,  /i®,  ,5®)  on  Sy  x 
[0, 1]  and 


inf  maxP(x,u.6)=  min 
{x,fi)eSv  o</3<i 


max  P(x,  u,  6)  = 
o<p<i 


max  min  P{x,  p, /d)  —  P{x^ ,  p\  13^).  (2.23) 

{x,fi)eSv 


Proof  (from  [30])  Let  {x^,p^)  be  the  solution  of  problem  (2.22).  Then  Si{p.i), 
i  =  0, 1,  depend  continuously  on  0  <  ^0  <  1.  Moreover,  since 


<So(/l^)  =  0,  if/3  =  l, 

So{po)  is  continuous  for  15  €  (0, 1],  and  since 

5i(Af)  =  0,  if/?  =  0, 
Si{pi)  is  continuous  for  P  e  [0, 1).  Suppose 

lim<Si(/zf)  >  0. 

Then  for  some  0  <  ,0®  <  1,  we  must  have 

So(p„‘)=Sdg'). 


(2.24) 


(2.25) 


(2.26) 


(2.27) 


Let  (a:®,//®)  =  {x^^^p^").  Then 


P(.x®,p®,^)<cS,(p®),  V^G[0,1], 


(2.28) 
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(holding  at  equality  because  {3  cancels  out)  and 

P{x,p,p^)  >  Siinl),  \/{x,p)  e  Sy  (2.29) 

because  is  the  optimal  solution  of  (2.22)  for  P  =  /5®.  This  implies  that 

{x\p\P^)  is  a  saddle  point  and  the  rest  follows.  Now  suppose  that  (2.26)  does  not 
hold  so  that 

lim5i(/ii)  =  0.  (2.30) 

In  that  case  5o  and  5i  can  be  made  arbitrarily  small  simultaneously.  This  implies 
that  Jv{P)  =  0  for  all  P  which  means  that  there  exists  (rr®,/i*)  such  that  (2.27)  holds 
with  equality  to  zero.  Then,  clearly  (2.28)  holds  because  both  sides  of  the  inequality 
are  zero.  In  addition,  (2.29)  holds  for  all  /3®  £  [0, 1]  because  the  right  hand  side  of 
the  inequality  is  zero  and  the  left  hand  side  cannot  be  negative.  This  implies  that 
(a;*,  p^)  is  a  saddle  point  and  the  rest  follows.  □ 

Note  that  the  above  proof  in  [30]  is  done  with  knowledge  of,  and  weight  matrices 
on  the  initial  state,  a;i(0).  In  that  case,  the  bounded  energy  noise  model  becomes 

Si{xm,pi)=Xi{QfFi,^Xi{p)+  r  \pi{t)\Ut<D  i  =  0,l.  (2.31) 

Jo 

Since  each  Si  is  the  sum  of  positive  semi-definite  terms,  letting  one  term  go  to  zero 
does  not  alter  the  proof. 

This  result  allows  us  to  interchange  the  order  of  the  inf  and  the  max  in  (2.16), 
and  replace  inf  with  min.  Thus 

Jy{p)  =  min  P{x,p,p)  (2.32) 

{XjfXjESv 

and,  the  characterization  of  not  proper  becomes 

max  JyiP)  <  1-  (2.33) 
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Expanding  this  result  to  its  fully  explicit  form,  we  see  that  a  detection  signal  v  is  not 
proper  if  and  only  if 

max  min  -  /  x^Qx  +  xF Hp,  +  /F Rp.  dt  <  I  (2.34) 

o</9<i  2  Jo 

where  the  min  is  subject  to 

x'  =  Ax  +  Bv  +  Mp.  (2.35) 

The  inner  minimization,  the  Jv{P)  problem,  is  a  standard  LQR  optimal  control  prob¬ 
lem  with  an  added  cross  term  in  the  objective  function  and  the  forcing  function  Bv 
in  the  constraint.  Jv{P)  is  called  the  auxiliary  cost  function  for  the  problem. 

The  auxiliary  cost  function  exhibits  several  useful  qualities  [12]. 

Lemma  2.2.  For  all  v  e  L'^ ,  for  0  <  P  <  I,  Jv{P)  is  defined  and  has  the  following 
properties: 

1.  It  is  zero  for  /I  =  0  and  P  =  I, 

2.  It  is  quadratic  in  v,  i.e.,  for  all  scalar  c,  Jcv{P)  =  \c\'^Jv{P), 

3.  It  is  a  continuous  function  of  P, 

4.  If  V  is  not  proper,  then  Jv{P)  <  1  for  all  0  <  P  <  1.  Equivalently,  J^P)  >  1 
for  some  P  implies  v  is  proper, 

5.  It  is  a  strictly  concave  function  of  P  if  the  set  of  proper  detection  signals  is  not 
empty,  otherwise  it  is  identically  zero. 

The  proof  is  straightforward  and  relies  on  continuity  and  linearity.  It  can  be  found 
in  [12].  With  this  result,  we  can  state  the  original  problem  of  finding  a  minimum 
energy  proper  detection  signal  v  as 

min][u||  subject  to  jnax^  J„(/5)  >  1.  (2.36) 
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Note  that  the  cases  /3  =  0  and  ^5  =  1  are  excluded  because  Ji,(0)  =  Jy(l)  =  0,  and 
Lemma  2.2  demonstrates  continuity  of  Jv(/3)  at  these  points. 

Using  the  fact  that  Jv(j8)  is  quadratic  in  v,  we  arrive  at  the  following  fundamental 
result 

Theorem  2.2.  Let 


Then 


J*iP) 


sup 


JvW) 


Cf 

Jo 


=  sup  Jv(/3). 
lHl=i 


(2.37) 


(7*)^  =  max  J*(/3)  (2.38) 

where  7*  is  the  separability  index  defined  previously. 

This  theorem,  while  similar  to  results  in  [29]  and  [30],  has  added  technical  difficul¬ 
ties  due  to  the  presence  of  the  infinite  dimensional  space  of  the  independent  variable 
and  the  unbounded  finite  dimensional  subspace  of  the  output  sets.  Despite  these  dif¬ 
ferences,  the  proof  is  an  extension  of  that  in  [29].  However,  it  is  somewhat  technical 
and  requires  functional  analysis  and  convergence  theory  for  sequences.  See  [12]  for 
the  complete  proof. 

Note  that  the  ease  of  separating  the  nominal  and  fault  models  of  a  system  is  pro¬ 
portional  to  the  size  of  7*.  When  7*  =  0,  the  models  are  indistinguishable  regardless 
of  the  detection  signal. 

As  a  final  result  before  defining  the  optimization  problems  we  will  address,  we 
state  a  useful  corollary  to  Lemma  2.2. 

Corollary  2.1.  A  detection  signal  v  is  proper  if  and  only  if  Jy{fi)  >  1  for  some 
0<I3<1. 


Proof  Lemma  2.2,  part  4,  shows  that  v  is  proper  if  Jy{fi)  >  1  for  some  fi.  To  show 
the  converse  suppose  that  Jv{fi)  <  1  for  all  fi.  For  each  fi,  let  {po{P),  pi{P)}  be 
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where  Jv(/3)  attains  its  minimum.  Clearly,  the  values  producing  a  minimum  at  each 
^  endpoint  are  /ii  =  0  for  ^  =  0,  and  //q  =  0  for  =  1.  Thus  there  will  be  a  value  /? 
where  ||/Uo(^)||  =  ||//i(^)||.  But  then  ||/ii()0)||  <  1,  which  shows  that  v  is  not  proper. 
□ 

2.1.3  Problem  Statement 

We  can  now  state  the  two  versions  of  the  problem  solved  by  the  first  half  of  the 
algorithm.  Version  One,  from  (2.37-2.38)  is; 

(7*)2  =  max  Jv(j3).  (2.39) 

INI  =  1 

0  <  /3  <  1 

Version  Two,  from  (2.7)  and  (2.36)  is: 

(7*)-"  =  min  /  \v\^dt.  (2.40) 

Jo 

M0)  >  1 
0  <  <  1 

These  problems  will  be  solved  by  first  calculating  the  necessary  conditions  for  a 
minimum  of  the  inner  problem  which  defines  Ju(/5),  then  numerically  solving  the 
outer  problem  using  the  previously  computed  necessary  conditions  as  constraints. 


2.2  Necessary  Conditions 

As  with  many  types  of  optimization  problems,  the  Jv{P)  problem  posisesses  conditions 
that  any  extrema  must  satisfy  in  order  to  be  an  optimal  solution.  In  Chapter  1  we 
introduced  the  necessary  conditions  for  an  optimal  solution  to  the  standard  LQR 
problem.  The  Jv{/J)  problem,  while  similar,  is  not  the  same  problem  as  that  discussed 
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in  Chapter  1  because  of  the  presence  of  the  cross  term  in  the  integral,  so  in  this  section 
we  develop  the  necessary  conditions  for  the  Jv(^)  problem  explicitly. 


2.2.1  Computing  the  Necessary  Conditions 

Recall  from  (2.32)  that 

1  Cf 


1 

Jy[P)  =  mm-  /  x^Qx  +  x^Hp,  + p^Rp  dt 
2  Jo 


subject  to 


x'  —  Ax  +  Bv  +  Mp. 


(2.41a) 


(2.41b) 


The  Hamiltonian  for  system  (2.41)  is 

H  =  -x^Qx  +  ^x^Hp  +  ^pFRp  +  {Ax  +  Bv  +  Mp).  (2.42) 

2  2  2 

As  described  in  Chapter  1,  the  Euler  equations  for  an  extremum  are 

Hj  =  x'  (2.43a) 

-Hi  =  A'  (2.43b) 

Hi  =  0.  (2.43c) 

These  conditions  applied  to  (2.42)  give 

x'  =  Ax  +  Bv  +  Mp  (2.44a) 

A  =  -Qx-\hp-A^X  (2.44b) 

0  =  Rp+]-H^x  +  M^X.  (2.44c) 

which  is  an  index  one  DAE  in  {x,  X,  p)  since  J?  >  0. 

While  our  algorithm  will  use  (2.44)  in  their  current  form,  it  will  be  beneficial  to 
define  these  conditions  in  terms  of  a  matrix  Riccati  equation  as  well.  Riccati  equations 
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stabilize  relatively  quickly,  and  are  thus  useful  in  theoretical  developments  involving- 
long  time  intervals.  The  derivation  also  lends  insight  into  boundary  conditions  for 
(2.44).  In  addition,  Riccati  equations  are  used  extensively  in  [30,  31]. 

2.2.2  Riccati  Form  of  Necessary  Conditions 

To  begin,  note  that  the  form  of  (2.41a)  is  a  particular  case  of  a  more  general  problem 

Z  =  min  -x{t Fx{t f)  +  -  [  x^ Qx  +  x'^Hfi  +  jF Rp,  dt  (2.45a) 

2  2  ./o 

subject  to 


x'  =  Ax  +  Bv  +  Mp  (2.45b) 

■where  F  =  251,  (5  >  0  and  small),  is  symmetric  positive  semi-definite.  The  Jv{P) 
problem  is  the  case  in  which  F  =  0,  but  for  the  following  derivation  we  leave  the  F 
term  in  the  cost.  A  nonzero  F  matrix  is  also  used  in  Section  2.4.5.  Fixing  the  initial 
state,  a;(0)  =  and  leaving  x{tf)  free,  we  minimize  over  all  possible  initial  conditions 
to  obtain  an  expression  for  the  optimal  cost  which  will  be  useful  in  one  form  of  our 
algorithm. 


Optimal  Trajectory 


Noting  that  extrema  of  (2.45)  must  also  satisfy  conditions  (2.44)  to  be  optimal,  and 
that  R  is  symmetric  positive  definite  for  0  <  ^  <  1,  we  solve  for  //,  in  (2.44c)  and 
substitute  into  (2.44a)  and  (2.44b)  to  obtain  a  system  in  (.x.  A) 


A-^MR-^H^ 
\HR-^H^  -  Q 


-MR-^M^ 

-  A^' 


V. 
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Letting 

S  = 

W  =  (symmetric,  >  0) 

V  =  MR~^M'^  (symmetric,  >  0) 

we  can  simplify  the  system  to 


X 

A 


A-S  -y 
W-Q  S'^-A'^ 


(A 

B 

+ 

J  Iv 

0 

(2.46) 


a  system  of  2n  linear  time  invariant  differential  equations,  with  the  vector 

acting  as  the  forcing  function.  The  system  has  the  following  boundary  condi 

•  At  t  =  0,  n  boundary  conditions  are  provided  by  the  initial  conditions: 
a;(0)  =  f 

•  At  t  =  tf,  n  boundary  conditions  are  provided  by  the  transversality  con¬ 


B 

0 

ions 


ditions: 


mr = 


d 


dx{tf) 


^x{tffFx{tf) 


Thus 


(2.47) 


To  determine  the  form  of  the  relationship  between  x  and  A,  let  D{t]0)  be  the 
2n  X  2n  fundamental  solution  matrix  for  (2.46).  Then 

r 

I  +  /  fi-(r;0)  ^ 

\{t)j  1  \ml  ■'»  0 

Shifting  this  equation  to  the  right  end  of  the  interval,  we  obtain 


Jo 


V  dr 


X{tf) 


x{t)  ^  ^  Cf 

At) 


Cf 

J  ^  {r]t) 


B 

0 


V  dr 
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Partitioning  D(tf‘,t)  into  four  n  x  n  blocks 


fl(tf;t) 


D2\{tf\t)  D22{tf]t) 


leads  to 

-  /■*/ 

^(^/)  “  d”  ^  ^  (T',t')Bv  dr  (2.48a) 

X{tf)  —  D2ix(t)  +  f222-^(^)  (2.48b) 

where  D  and  D  are  the  parts  of  D  and  0“^  which  conform  to  x{t).  Combining 
(2.47),  (2.48a),  and  (2.48b) 

/tf  ^  —  1 

n  (t;  t)BvdT 


leads  to  an  expression  for  X{t) 


X{t)  =  [O22(^/;t)-Ff2i2(t/;t)]"M^^ii(^/;0-^^2i(f/;i)].T(t) 
[D22{tf]t')  —  FDi2{tf',t)]  FD(tf',t)  D  (^T',t)Dv  dr 


(2.49) 


provided  the  indicated  inverse  exists.  Note  that  at  t  =  tf,  we  know  that  f2(t/;  tf)  =  I, 
i.e.. 


^n{tf',  tf)  —  D22{tf',  tf)  —  I 
^u{tf',  tf)  =  f^21  {tf',  tf)  =  0 


Thus  D22{tf',tf)- FDi2{tf',tf)  =  /  is  nonsingnlar.  Also  FQ.u{tf,tf)  -i'l2\{tf,tf)  =  F 
so  that 


X{tf)  =  IFx{tf)  +  IF  10  =  Fx{tf) 

in  agreement  with  (2.47).  Kalman,  et  al.,  [20]  have  shown  that  the  inverse  exists  for 
all  to  <  t  <  tf,  so  (2.49)  is  valid. 
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The  form  of  (2.49)  leads  us  to  believe  that  x(t)  and  A(t)  are  related  by 

A(t)  =  K(t)x(t)  -  g(t).  (2.50) 

From  (2.46),  we  know  that 

x'  =  (A-  S)x  -  VA  +  Bv. 

Combining  this  equation  with  (2.50)  results  in  the  expression  of  the  optimal  trajectory 

x'  =  {A- S-  VK)x  +  Vg  +  Bv.  (2.51) 

Riccati  Equation 

Differentiating  (2.50),  we  obtain 

A'  =  K'x  +  Kx'  -  g'.  (2.52) 

Substituting  (2.51)  into  (2.52),  we  obtain 

A'  =  [K'  +  K{A  -S)-  KVK]x  +  KVg  +  KBv  -  g'.  (2.53) 

From  (2.46),  we  also  know  that 

A'  =  (IF  -  Q)x  +  -  A^)X. 

Combining  this  equation  with  (2.50)  results  in 

A'  =  [{S'^  -  A^)K  +  W-Q]x  +  {A^  -  S^)g.  (2.54) 

As  long  as  an  optimal  solution  exists,  (2.53)  and  (2.54)  must  hold  for  all  x  and  t. 
Equating  coefficients  yields 


K'  =  {S'^  -  A^)K  +  K{S  -A)+  KVK  +  W  -Q 


(2.55) 
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and 


g' =  (S^ -A^ +  KV)g  +  KBv.  (2.56) 

The  boundary  conditions  for  these  equations  are  obtained  at  t  —  tj  as  follows: 

•  From  (2.50),  \{tf)  =  K{tf)x{tf)  -  g{tf), 

•  From  (2.47),  X{tf)  =  Fx{tf). 

Both  of  these  must  hold  for  all  x{tf),  so 

K{tf)  =  F  [in  agreement  with  the  note  below  (2.49)] 
g{tf)  =  0. 

With  the  boundary  conditions  completely  specified  at  t  =  tj.  (2.55)  and  (2.56) 
can  be  solved  to  obtain  K{t)  and  g(t)  uniquely  for  all  t  G  [0,  i/j.  Furthermore,  if  K[t) 
is  the  solution  of  (2.55),  and  K{tj)  =  F,  then  K{t)  is  symmetric  for  all  t  e  [0,t/], 
because 

{K'f  =  [(5^  -  A'^)K  +  KiS  -  A)  +  KVK  +  W-Qf 

and  thus 

{K'^y  =  K^{S  -  /I)  +  {S^  -  A'^)K'^  +  K'^VK^  +  W  -Q 

since  W  and  Q  are  symmetric,  which  means  that  K  and  are  solutions  of  the  same 
Riccati  equation.  This  fact,  combined  with  the  boundary  condition  K{tf)  —  F  = 
K{tfY' ,  and  the  uniqueness  property  of  ODE  solutions,  implies  that  K  =  K'^. 

In  fact,  K  is  also  positive  definite  and  bounded  for  all  t  G  [0, t/j.  To  see  the.se 
facts,  we  must  first  show  W  —  Q  to  be  negative  definite.  Recall  the  construction  of 
the  M  matrix 

Mo  MoNo^Ni  0 

0  Ml  Ml 


M  = 
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where 

Mi  =  [  Mi  Mi  ]  ,  A/i  =  [  iVi  0  . 

We  have  assumed  that  each  Mi  is  full  row  rank,  and  that  the  coordinate  change 
is  done  such  that  Ni  is  invertible.  This  implies  that  M  is  full  row  rank,  and  thus 
V  =  is  symmetric  positive  definite,  since  R  is  symmetric  positive  definite. 

Letting  C  =  -'Nq^Ci  ,  we  can  rewrite  Q  and  H  as 

Q  =  2/5C'^C 

H  =  -A/3C'^No'[o  Ni  0  • 

Thus 

W  =  IHR-^H'^ 

=  l{16p^)C'^No'[o  iVi  o]i?-'[o  Ni  oj^Nfc 

=  2^^C^N~'Ni[{l-P)I  +  pN'^iNo'^No'Niy  N'^i'N/c. 
Performing  a  singular  value  decomposition  on  Nq  ^  Ni ,  we  obtain 

ai  0  0 

0  0 
0  0  (Tm 

where,  since  {Nq^Ni)~^  exists,  all  cTj  are  bounded  away  from  zero.  Then 

W  =  2P^C'^U'Lv\{l-p)I  +  ^{UDVf{UDV)y\uDV)'^C 

=  2^‘^C'^UD  [(1  -  /5)/  +  PdA  DU^C 

0 

0  u^c 


Nq^Ni  =  UEV  =  U 
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and 


Q  =  2pC'^C  =  2pC'^UU^C. 


Subtracting,  we  obtain 


W-Q  =  2C^U 


=  2C^f/ 


_ R 

(1-/3)+/3<t?  ^ 


U^C  -  2C'^UPIU'^C 


U'^C. 


At  ^  =  1,  =  0>  and  at  /?  =  0,  =  0.  This  expression  is 

quadratic  in  P,  and  so  it  can  have  at  most  two  distinct  roots.  Thus,  for  values  of  /? 
between  0  and  1,  it  will  either  be  positive,  or  negative  (or  identically  zero,  which  it 
is  not),  but  not  a  combination  of  both.  Evaluation  at  the  midpoint  of  the  interval 
shows  that  the  expression  is  always  negative.  This  fact  implies  that  IT  —  Q  is  negative 
definite. 

Now  in  solving  the  Riccati  equation  (2.55)  in  backward  time 


K'  =  (5^  -  A^)K  +  K{S  -  A) KVK  +  W  -Q 
at  t  =  t/,  we  know  that  K{tf)  =  F  —  251,  where  6  is  nonnegative  and  small.  Thus 
K'{tf)  =  {smalt)  +  (smalt)  +  (smalt)'^  +  (negaMve  definite)  =  (negative  definite) 


and  thus 


K(tf  -  At)  >  0. 

If  5  =  0,  the  same  result  holds.  Thus  we  see  that  if  F  is  symmetric  positive  semi- 
definite,  then  K(t)  is  symmetric  positive  definite  at  t  =  tj  —  At.  Either  K(t)  remains 
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positive  definite  in  backward  time  or  it  does  not.  If  it  remains  positive  definite,  then 
at  those  times,  r,  when  it  begins  to  lose  definiteness,  (r— At)x  >  0  for  all  nonzero 

vectors  x-  If  K{t)  does  not  remain  positive  definite,  then  at  those  times,  r,  when 
K{t)  loses  definiteness  there  must  exist  a  nonzero  vector  X)  for  which  K{t)x  =  0.  If 
this  is  true, 

X^K'{t)x  =  x^{S^  -  A^)K{r)x  +  X^K{T){S  -  A)x  + 

X^K{T)VK{T)x  +  X^i^  -  Q)X 

=  0  +  0  +  0  +  {negative  definite). 

That  is,  ')^K'{t)x  <  0,  which  implies  that  x^K{t  -  At)x  >  0,  and  K{r  -  At) 
becomes  positive  definite  again.  Since  this  true  for  any  vector  that  causes  loss  of 
positive  definiteness  as  well  as  those  vectors  that  do  not,  it  is  true  for  all  vectors. 
Thus  K{t)  remains  positive  definite. 

Furthermore,  at  t  =  r  +  At,  for  any  nonzero  vector  x,  for  which  K{t)x  =  0, 

X^K'i'r  +  At)x  =  {small)  +  {small)  +  {smallY  +  {negative  definite) 

which  implies  x^K'{T+At)x  <  0.  By  the  same  argument  then,  K{t)  cannot  continue 
to  lose  definiteness,  and  is  therefore  bounded  below,  away  from  zero. 

To  see  that  K  is  bounded  above,  multiply  (2.55)  by  K~^  on  both  sides. 

K-^K'K-^  =  K-\S^  -  A^)  +  {S-  A)K-^  +  V  +  K-\W-  Q)K-\ 

Since  {K-^y  = 

{K-^y  =  K-\A^  -  5^)  +  {A-  S)K-^  +  K-'^{Q  -  W)K-^  +  {-V) 

which,  since  Q  -W  >  0  and  {-V)  <  0,  is  the  same  Riccati  equation  as  (2.55),  with 
K~^  as  the  solution.  So,  by  the  same  argument  as  that  above  for  K{t),  K{t)~^  is 
bounded  away  from  zero.  This  fact  implies  that  K {t)  is  bounded  above  as  well.  Thus, 
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if  F  is  symmetric  positive  semi-definite,  then  K(t)  is  the  symmetric  positive  definite 
solution  to  a  matrix  Riccati  equation,  bounded  above  for  all  t  G  [0,t/). 


Optimal  Cost 

Returning  to  the  cost  function,  suppose  the  optimal  cost-to-go  at  time  t  has  the  form 

Z’iP,  t)  =  \x(tf  K(t)x(t)  -  g{tfx{t)  +  <p(t).  (2.57) 

To  obtain  an  expression  for  ip{t),  define  the  Hamilton- Jacobi  equation  for  the  sj^stem 
as 


§lZ’(P,  t)  +  min,.  [^x’'Qx  +  ^x^H^l  +  \iFRii 

Letting  X{t)  =  ■^Z*{P,t)  be  the  Lagrange  multiplier  as  before,  the  p,  that  mini¬ 
mizes  the  expression  in  brackets  is  given  by  (2.44c).  Substituting  for  p  in  (2.58)  and 
expanding,  we  obtain 

+  -x'^iQ  -  W)x  +  x'^iA^  h'^VX  4-  v'^BX  =  0  (2.59) 

ot  2  2 

where  S,  W,  and  V  are  as  before.  Now,  if  (2.57)  is  correct,  then 

Z*  =  -x^Kx  —  g^x  +  g>. 


Thus 

^Z*  =  \x'^K'x  -  g''^x  +  A 
dt  2 

and 

^Z*  =  X  =  Kx  -  g. 
dx 
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Substituting  these  into  (2.59),  then  expanding  and  combining  like  terms  we  obtain 

[K'  +  Q-W  +  -  S^)K  +  K(A  -  S)  -  KVK]  x 

+x^  [-o' +  {S^  -  A'^ +  KV)g  +  KBv]  (2.60) 

Ag^'-y'^Vg-v^B^g  =  0. 

For  (2.60)  to  hold  for  all  x  and  t,  (2.55)  and  (2.56)  must  hold,  and 

ip'  =  \g^Vg  +  v^B^g  (2.61) 

must  also  hold.  Boundary  conditions  for  ip,  obtained  from  (2.57)  att  =  tf,  are 

Z*W,tf)  =  ^x{tffK{tf)x{tf)  -  g{tffx{tf)  +  ip{tf). 

Since  K{tf)  =  F,  and  g{tf)  =  0,  this  expression  reduces  to  the  terminal  cost, 
\x{tjfFx{tf),  i^g?{tf)  =  0. 

To  obtain  an  expression  for  the  total  optimal  cost,  first  note  that  the  total  cost 
for  the  fixed  initial  condition  problem  is 

0)  =  -  9(0)"'e  +  »)(0). 

The  solution  for  the  free  initial  state  problem  is  obtained  by  letting  ^  vary,  and 
minimizing  Z*  over  the  initial  states.  We  find  the  necessary  condition  for  a  minimum 
is 

Fz‘  =  fK{0)  -  g{0f  =  0. 

Note  that  from  (2.50) 

A(0)^  =  x(0)^X(0)  -  g{0f  =  eK{0)  -  g{0)'^. 

Thus  the  optimal  Z*  for  the  free-endpoint  problem  occurs  when  A(0)  =  0 

z’iP)  =  |e’'^r{0)^-l;(0f^  +  v’(0) 

=  »)(0)  -  i3:{0)^A-(0)i(0) 
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and  since 


A(0)  =  K(0)x(0)  -  <7(0)  =  0  (2.62) 

then  /ir(0)a:(0)  =  ^(0),  and  we  obtain  the  expression  for  the  optimal  cost 

Z‘{/?)  =  V>(0)-^x(0fs(0).  (2.63) 


Summary 

The  necessary  conditions  for  an  optimal  solution  to  the  Z  problem,  (2.45),  can  be 
represented  in  terms  of  a  stable  matrix  Riccati  differential  equation  and  two  linear 
vector  differential  equations.  The  above  derivation  is  completely  consistent  with  the 
case  F  =  0.  Thus,  the  Jv(jS)  problem  can  be  characterized  in  terms  of  those  derived 
quantities.  In  the  next  section,  we  will  formulate  the  max  min  problem  in  terms  of 
the  two  forms  of  the  necessary  conditions  derived  above. 

2.2.3  Problem  Formulation  in  Terms  of  the  Necessary  Con¬ 
ditions 

The  previous  discussion  can  be  summarized  by  expressing  (2.39)  and  (2.40)  as  bound¬ 
ary  value  problems  (BVPs)  in  terms  of  the  two  forms  of  the  necessary  conditions:  the 
raw  form,  (2.44),  and  the  Riccati  form,  (2.55),  (2.56),  (2.51),  (2.61),  and  (2.62). 
Using  the  raw  form  of  the  necessary  conditions.  Version  One,  (2.39),  becomes 

(7*)^  =  TnaxZ(tj-) 

V,j0 


(2.64a) 
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subject  to  the  constraints 


x' 

=  Ax  +  Bv  +  Mp 

(2.64b) 

A' 

=  -Qx-l-Hp-A^X,  A(0)  =  A(t/)  =  0 

(2.64c) 

e' 

=  0(0)  =  0,  9{tj)  —  1 

(2.64d) 

Z' 

=  ^  \x^Qx  +  x^Hp  +  p^Rp]  ,  Z(0)  =  0 

(2.64e) 

0 

=  +  M^A 

(2.64f) 

0  <  ^  <  1. 

(2.64g) 

Equations  (2.64)  are  an  index  one  BVP  in  (x,X,9,Z,p)  since  R  >  0.  Version  Two, 


(2.40),  becomes 

(7*)-2  =  min  r  \\vfdt  (2.65a) 

Jo 

subject  to  the  constraints 

x'  =  Ax  +  Bv  +  Mp  (2.65b) 

A'  =  -Qx-^Hp- A^X,  A(0)  =  A(t/)  =  0  (2.65c) 

Z’  =  l-[x^Qx  +  x'^Hp  +  p'^Rp],  Z(0)  =  0,  Z{tf)>l  (2.65d) 

0  =  Rp+^H^x  +  M^X  (2.65e) 

0  <  ^  <  1.  (2.65f) 

Using  the  Riccati  form  of  the  necessary  conditions,  Version  One,  (2.39),  becomes 

(7*)^  ==  max  7)(0)  -  ia:(0)^p(0)  (2.66a) 

v,l3  I  Z 
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subject  to  the  constraints 

K'  =  (S'^-A^)K  +  KiS-A)-hKVK-\-W-Q,  K{tf)  =  0  (2.66b) 

g'  =  {S^  -  A^ +  KV)g  +  KBv,  g{tf)^0  (2.66c) 

x'  =  {A-  S  -VK)x  +  Vg  +  Bv  (2.66d) 

=  ^g'^Vg  +  g^Bv,  <p{tf)  ^  0  (2.66e) 

O'  =  v'^v,  0(0)  =  0,  0{tf)  —  1  (2.66f) 

0  =  .ft:(0)a;(0)  -  5(0)  (2.66g) 

0  <  ^  <  1.  (2.66h) 

Version  Two,  (2.40),  becomes 

(7*)“^  =  min  /  (2.67a) 

Jo 

subject  to  the  constraints 

K'  =  {S^-A'^)K  +  K{S-A)  +  KVK  +  W-Q,  K{tf)  =  0  (2.67b) 

g'  =  {S'^-A'^  +  KV)g  +  KBv,  g{tf)  =  0  (2.67c) 

x'  =  {A-  S  -VK)x  +  Vg  +  Bv  (2.67d) 

(p'  =  ^g'^Vg  +  g'^Bv,  ip{tf)  =  0  (2.67e) 

0  =  i^(0)x(0)  -  5(0)  (2.67f) 

1  <  (,^(0)  -  ^a;(0)'^5(0)  (2.67g) 

0  <  ^  <  1.  (2.67h) 

Equations  (2.64e)  and  (2.65d)  define  J„(/5)  explicitly,  while  (2.67g)  and  the  right  hand 
side  of  (2.66a)  define  it  in  terms  of  (2.63).  Equations  (2.65d)  and  (2.67g)  also  reflect 
the  requirement  for  v  to  be  proper,  Jv{P)  >  1-  Note  that  both  formulations  of  Version 
One,  (2.64)  and  (2.66),  require  the  detection  signal  to  be  normalized.  Thus,  for  these 
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problems,  the  minimum  energy  proper  detection  signal  must  be  rescaled  by  No 
rescaling  is  necessary  for  the  formulations  of  Version  Two,  (2.65)  and  (2.67). 


2.2.4  Sufficient  Conditions 


The  arguments  of  the  two  previous  sections  relied  on  the  assumption  that  the  nec¬ 
essary  conditions  guaranteed  a  minimum.  Sufficient  conditions  for  a  local  minimum 
are  in  fact  included  in  the  necessary  conditions  as  well.  To  see  this  fact,  recall  that 
the  Hamiltonian  for  our  problem  is 

H  =  -x^Qx  +  -x^Hp  +  +  X^iAx  +  Bv  +  Mp).  (2.68) 

2  2  2 

It  is  a  well  known  fact  (Athans  and  Falb  [1],  for  example),  that  if  the  matrix 

d^S  d^H 
dx'^  dx  dfjL 

dfi  dx 

is  positive  semi-definite,  then  the  noise,  p,  which  satisfies 

H,  =  0  (2.70) 


(2.69) 


(one  of  the  Euler  equations)  is  at 

dH 
dx 
d‘^H 
dx"^ 
dH 
dp 
d^H 
dp^ 
d^H 
dx  dp 
d'^H 


least  locally  optimal.  Prom  (2.68),  we  find  that 


Qx  -h  \.Hp  -h 

(2.71a) 

Q 

(2.71b) 

Rp  +  ^H^x  +  M^X 

(2.71c) 

R 

(2.71d) 

(2.71e) 

2 

(2.71f) 

dp  dx 
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Substituting  (2.71)  into  (2.69),  we  obtain 


Q  Iff 
iff^  ff 


(2.72) 


which  is  equivalent  to  G^G,  a  symmetric  positive  semi-definite  matrix,  where 


'  y/m^o'Go 

-^ywNo'ci 

0 

0 

0 

0 

0 

0  0 

G  = 

0 

0 

VWi 

0  0 

0 

0 

0 

v/2(l  -  P)I  0 

0 

0 

0 

0  \/2(l  -  P)J 

(2.73) 


Thus  (2.72)  is  positive  semi-definite.  Since  the  higher  derivatives  of  N  are  zero,  the 
noise,  /r,  obtained  from  (2.70)  and  (2.71c)  minimizes,  at  least  locally,  the  cost  [1]. 


2.3  The  Minimum  Energy  Detection  Signal  Algo¬ 
rithm 

It  is  useful  at  this  point  to  put  the  detection  signal  problem  in  algorithm  form. 
The  algorithm  described  below  is  suitable  for  solving  in  Boeing’s  Sparse  Optimal 
Control  Software  (SOCS)  [3,  4].  The  SOCS  package  is  a  collection  of  FORTRAN 
77  subroutines  capable  of  solving  a  great  variety  of  optimal  control  problems.  The 
package  can  handle  many  types  of  constraints,  including  and  algebraic  constraints, 
which  are  the  types  of  constraints  generated  by  the  problems  considered  in  this  thesis. 
The  main  drawback  of  the  software  is  that  it  is  written  in  FORTRAN  77,  which  has 
limited  matrix  capability,  and  therefore  requires  translation  of  all  system  parameters 
into  indexed  variable  form.  This  can  be  overcome  by  the  use  of  additional  software 
which  can  automatically  generate  FORTRAN  code  from  matrix  formatted  input.  In 
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Appendix  A,  we  describe  routines  in  MATLAB,  by  The  MathWorks,  Inc.  [26]  and 
MAPLE,  by  Waterloo  Maple,  Inc.  [18]  which  can  be  used  to  accomplish  the  required 
translation,  as  well  as  a  sample  SOCS  driver  file. 

The  algorithm  requires  the  system  model  to  be  in  the  form  previously  described 

x'i  =  AiXi  +  BiV  +  MiPi  (2.74a) 

y  =  CiXi  +  NiPi  (2.74b) 

for  i  =  0  and  1,  where  Xi,  y,  v,  and  pi  are  the  system  states,  output,  detection  signal, 
and  noise,  respectively.  The  matrices  Aj,  Bi,  Ci,  Mi,  and  iVj  must  be  matrices  of 
appropriate  dimensions,  with  Mj  and  Ni  having  full  row  rank.  In  addition,  [Ai  Q] 
must  be  observable. 

The  minimum  energy  detection  signal  (MEDS)  algorithm  (with  appropriate  soft¬ 
ware  in  parentheses): 

1.  Perform  QR  decomposition  on  Nf  (MATLAB) 

Nj'  =  QiRi  (2.75) 

where  the  Qi  are  unitary  matrices.  Thus 

Ni  =  RJQJ  (2.76) 


2.  Perform  constant  orthogonal  coordinate  changes  on  pi  (MATLAB) 

(a)  Let  RJ  =  [Ni  0] ,  where  Ni  is  invertible 

(b)  Let  QJ Pi  =1^1  with  the  same  partitioning  as  Rj.  Thus 

\pi 


NiPi  =  [Ni  0] 


Fi 

.Pi 


(2.77) 
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(c)  Let  =  [Mi  Mi]  with  the  same  partitioning  as  Rj.  Thus 


MiPi  =  MiQi  '^Qjpi  =  [Mi  Mi 


.hi  , 


(2.78) 


Note  the  system  model,  (2.74),  becomes 


x\  =  AiXi  +  BiV  +  MiPi  +  MiPi  (2.79a) 

y  =  CiXi  +  NiPi  (2.79b) 


3.  Reduce  model  dimension  by  eliminating  y  and  p^  (MATLAB) 

(a)  Combine  both  equations  (2.79b)  for  y,  solve  for  Pq,  and  substitute  into 
(2.79a),  i  =  0 

Xq  =  (Aq  —  MqNq  Co)xo  +  MqNq  C'l.Ti  +  BqV  +  MoPq  +  MqNq  Nip,i 

(2.80) 


(b)  Let 


I  00o\ 

Ao-MoN~'Co  MoN~^C, 

,  B  = 

B, 

X  ^  \ 

,  A  = 

1 

o 

_ 1 

.  . 

Mo  MoNo'Ni 

1 

0 

0  Ml 

Ml 

,  h  = 

Ml 

(m./ 

Note  the  system  model,  (2.79),  is  now 

x'  =  Ax  +  Bv  +  Mp 


(2.81) 


4.  Compute  new  system  matrices  (MATLAB) 
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(a)  Let  C  =  [A^o -  Nq  ^Ci]  and  N  —  Nq^[0  Ni  0],  with  the  columns  of 
N  conforming  to  p 

(b)  Let  H  =  and  Q  =  2^C'^C 

(c)  Let 


R  =  2 


131  0  0 

0  (i-^)j  +  ^ivfiVo^]Vo'iVi  0 

0  0  (1-/0)^ 


with  its  rows  and  columns  conforming  to  p 

(d)  Compute  A,  B,  M,  H,  Q,  and  R 

(e)  For  Riccati  version,  compute  S  =  ,  W  —  ^HR~^H^,  and  V  = 

MR-^M^ 


5.  Perform  constrained  optimization  (SOCS) 

(a)  Version  One,  solve 

=  m&xZ{tf)  (2.82a) 

v,0 

subject  to  the  constraints 

x'  =  Ax  +  Bv  +  Mp  (2.82b) 

A'  =  -Qx-^Hp-A'^X,  X{0)  =  X{tf)=0  (2.82c) 

&  =  v^v,  ^(0)=0,  ^(^/)  =  1  (2.82d) 

Z'  =  -  [x'^Qx  +  x'^Hp  +  p'^Rp]  ,  Z{0)  =  0  (2.82e) 

0  =  Rp  +  ^H^x  +  M'^X  (2.82f) 

Li 

0.01  <  yd  <  0.99  (2.82g) 
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(b)  Version  Two,  solve 

(7*)''^  =  min  /  (2.83a) 

Jo 

subject  to  the  constraints 

x'  =  Ax  +  +  Mp  (2.83b) 

A'  =  -Qx-^Hp-  A^X,  X{0)  =  X{tf)  =  0  (2.83c) 

Z'  =  ^  [x^Qx  +  x^Hp  +  p'^Rp]  ,  Z{0}  =  0,  Z{tf)  >  1  (2.83d) 

0  =  Rp+^H'^x  +  M'^X  (2.83e) 

0.01  <  ^  <  0.99  (2.83f) 

(c)  Version  One,  the  Riccati  form,  solve 

(7*)2  =  max  (p{0)  -  l:x{0)'^g{0)  (2.84a) 

u,/?  L  z 

subject  to  the  constraints 

K'  =  {S'^ -A^)K  +  K{S-A)  +  KVK  +  W -Q  (2.84b) 

g'  =  {S'^  -  A^  ^KV)g^KBv,  g{tj)=0,  /•s:(^/)  =  0  (2.84c) 

x'  =  {A  -  S  -VK)x  +  Vg  +  Bv  (2.84d) 

A  =  ^g^yg-h  g'^Bv,  (/3(t/)  =  o  (2.84c) 

9'  =  v'^v,  0(0)  =  0,  9{tf)  =  l  (2.84f) 

0  =  /t'(0)a;(0)-^(0)  (2.84g) 

0.01  <p<  0.99  (2.84h) 

(d)  Version  Two,  the  Riccati  form,  solve 

r*-j 

(7*)“^  =  min  /  (2.85a) 

Jo 
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subject  to  the  constraints 


K' 

=  {S^  -  V)JT  +  K{S  -  A)  +  KVK  +  W-Q 

(2.85b) 

9' 

=  {S^-A'^  +  KV)g  +  KBv,  g{tf)  =  0,  K{tf) 

=  0  (2.85c) 

x' 

=  {A-S-VK)x  +  Vg  +  Bv 

(2.85d) 

=  ^9'^Vg  +  g'^Bv,  (p{tf)=0 

(2.85e) 

0 

=  Ki0)x{0)-g{0) 

(2.85f) 

1 

<  (^(0)  -  ^a:(0)'^^(0) 

(2.85g) 

0.01  <  /3  <  0.99 

(2.85h) 

For  each  formulation,  v  and  p,  are  treated  as  control  variables  while  /5  is  passed  to 
SOCS  as  a  parameter.  The  constraint  (2.85h)  and  the  others  identical  to  it  are 
required  to  bound  (3  away  from  0  and  1  early  in  the  optimization  process.  The 
minimum  energy  proper  detection  signal,  v,  is  for  (2.82)  and  (2.84).  The  minimum 
energy  proper  detection  signal,  v,  is  simply  v  for  (2.83)  and  (2.85). 

2.4  Variations 

The  assumptions  and  problem  formulations  described  above  limit  the  types  of  prob¬ 
lems  that  may  be  solved  by  the  MEDS  algorithm.  This  fact  leads  to  questions  about 
extending  the  algorithm  to  variations  of  the  problem:  Can  the  algorithm  handle  prob¬ 
lems  with  more  than  one  fault  model?  What  if  the  problem  must  be  formulated  more 
in  terms  of  the  original  system  matrices?  What  happens  when  a  control  is  already 
present  in  the  model?  Could  alternative  cost  functions  be  minimized?  Does  knowl¬ 
edge  of  initial  conditions  impact  the  theory?  This  section  will  address  these  questions 
and  demonstrate  the  flexibility  of  the  algorithm. 

It  should  be  noted  that  while  this  section  uses  Equations  (2.65)  as  the  form  of  the 
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model  around  which  we  describe  variations,  any  of  the  four  forms  could  be  used.  It  is 
not  our  intent  to  promote  one  form  of  the  algorithm  over  another,  merely  to  present 
the  options  available. 

2.4.1  Multiple  Fault  Models 

Many,  if  not  most,  real  world  systems  which  are  represented  as  multi-model  control 
systems  have  more  than  one  fault  model.  Unfortunately,  this  is  one  of  the  types  of 
problems  that  approaches  such  as  [30]  are  unable  to  efficiently  address.  However, 
our  MEDS  algorithm  can  solve  the  problem  in  one  of  two  ways.  For  this  discussion, 
assume  that  three  models  exist  for  the  system:  the  nominal  model,  (i  =  0),  and  two 
fault  models,  (i  =  1,2).  Obviously,  problems  with  more  than  two  fault  models  may 
be  solved  with  either  method  using  combinatorial  extensions. 

The  first  method  applies  the  MEDS  algorithm  in  a  sequential  manner,  an  ineffi¬ 
cient  approach  which  [30]  is  forced  to  take.  Dividing  the  test  period,  [0,  tj],  into  three 
subperiods,  [0,^],  [^,^],  and  [^,t/],  use  the  algorithm  to  solve  model  0  versus 
model  1  in  the  first  subperiod  for  uqi,  model  0  versus  model  2  in  the  second  subperiod 
for  Vq2,  and  model  1  versus  model  2  in  the  final  subperiod  for  vn-  Because  a  detection 
signal  which  is  proper  on  an  interval  is  still  proper  on  any  longer  interval  including 
the  original  interval  (due  to  the  fact  that  an  noise  on  the  test  period  has  the  same 
or  smaller  norm  on  a  shorter  subperiod),  the  composite  v  obtained  by  applying  Vo\  on 
[0,  ^],  then  vo2  on  [^,  ^],  and  then  Vi2  on  [^,t/],  will  be  a  proper  detection  signal. 
It  will  also  be  the  minimum  energy  proper  detection  signal  for  the  problem  via  this 
approach. 

Since  this  method  involves  shorter  test  periods  for  the  individual  comparisons,  it 
has  a  tendency  to  produce  detection  signals  of  higher  than  necessary  energy.  The 
second  method  avoids  this  tendency  by  simultaneously  solving  all  three  subproblems 
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independently  over  the  entire  test  period  for  a  common  v,  an  approach  of  which  many 


prior  methods,  including  [30],  are  incapable.  Consider 

(7*)'^=  mm  /  WvW^dt  (2.86a) 

V,I3o1,Po2,P12  Jq 

subject  to  the  constraints 

,^01  =  Aoixqi  +  Bqiv  +  Moigoi  (2.86b) 

^01  =  —Qoi^oi  —  2^01^01  - Aoi(O)  =  Aoi(i/)  =  0  (2.86c) 

■^01  =  ^  [xoiQoixoi  +xliHoipoi  +  PoiRoiPoi]  ,  ^oi(O)  =  0,  .^oi(t/)  >  1  (2.86d) 
0  =  i?oi^oi  +  2-^m^oi  +  M^Aoi  (2.86e) 

0.01  <  poi  <  0.99  (2.86f) 

Xq2  =  -^02X02  +  Bq2V  +  1^02^02  (2.86g) 

Ao2  =  —Qo2XQ2  -  2^02^02  “  ■d^2Ao2)  Ao2(0)  =  Ao2(t/)  =  0  (2.86h) 

^02  =  ^  [a;o2Qo22;o2  F  x'q2Hq2P02  + /uJ2-Ro2/^02]  ,  ^02(0)  =  0,  Zo2{tf)  >  1  (2.86i) 

0  =  R02P02  +  2HQ2XQ2  +  A<f^Ao2  (2.86j) 

0.01  <  ^02  <  0.99  (2.86k) 

x'i2  =  ^12X12  +  .612'^  +  M12P12  (2.861) 

Ai2  =  — Qi2'ri2  -  -j^Hi2P\2  —  d^r2Ai2)  Ai2(0)  =  Ai2(t/)  =  0  (2.86m) 

Z'u  =  ^  [a;i’2Cl2a;i2  +  x{2Hi2Pl2  +  P\2Rl2Pl2]  ,  ^12(0)  =  0,  ^I2(t/)  >  1  (2.86n) 

0  =  Ri2Pi2  +  \h'(2^i2  +  mI^\i2  (2.86o) 

0.01  <  ^X2  <  0.99  (2.86p) 


where  a  vector^  or  a  matrix^  is  the  unsubscripted  vector /matrix  in  the  simple  model 
(2.65)  when  model  i  and  model  j  are  the  two  models  for  which  the  problem  is  being 
formulated.  Note  that  only  the  detection  signal  is  common  between  models.  This 
independence  increases  the  size  of  the  problem  with  both  the  size  and  number  of 
models. 
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It  should  also  be  noted  that  if  the  goal  is  to  detect  a  fault,  but  it  is  not  important 
to  distinguish  between  fault  models,  the  detection  signal  should  be  minimized  by 
solving  only  those  pairwise  problems  including  the  nominal  model  (  i.e.,  in  the  three 
model  case,  model  0  versus  model  1,  and  model  0  versus  model  2).  The  nominal 
model  will  then  be  distinguishable  from  the  fault  models,  but  individual  fault  model 
outputs  may  overlap  each  other.  These  cases  will  be  discussed  further  in  the  next 
chapter.  Examples  of  multiple  fault  model  problems  are  included  in  Chapter  4. 


2.4.2  Unreduced  Model 


In  addition  to  multiple  fault  models,  many  system  models  have  parameters  which 
represent  some  kind  of  physical  quantity.  Model  reduction  hinders  the  ability  to 
match  parameters  with  real  quantities,  and  so  it  may  be  useful  in  these  cases  to 
formulate  the  problem  more  in  terms  of  original  system  equations.  Consider  the 
characterization  of  not  proper,  (2.10),  after  applying  the  result,  (2.23) 


r^j 

max  min  /  P\fjio{t)f  +  {1  —  P)\iM{t)\‘^  dt  <  1  (2.87a) 

Jo 

with  the  minimum  subject  to 

Xq  =  AqXq  +  Bqv  +  Mo/j-o  (2.87b) 

x'l  —  AiXi  +  B\V  -p  Mip,}  (2.87c) 

0  =  CqXo  —  CiXi  +  NqPo  ~  hlifii-  (2.87d) 


Instead  of  reducing  (2.87)  as  before,  let 
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C  =  [Co  -Cl],  N^[No  -iVij. 

Then  the  characterization  of  not  proper,  (2.87)  becomes 

1 

max  min  -  /  u^Vru  dt  <  1 

o</5<i  2  Vo 

with  the  minimum  subject  to 

x'  =  Ax  +  Bv  +  Mp 
0  =  Cx  +  N  p. 

The  Hamiltonian  for  the  inner  problem  is 


H  =  \p^V^p  +  +  Ax -\-Bv  A-  Mp)  +  Af  {Cx  +  Np).  (2.89) 

Applying  the  necessary  conditions,  (2.43),  to  (2.89),  we  obtain 

0  =  Fpp  +  Xq  +  iV^Ai  (2.90) 

A'  =  -A^Ao-C'^Ai  (2.91) 

as  well  as  (2.88b)-(2.88c).  Thus  we  obtain  an  unreduced  problem  that  can  be  ex¬ 
pressed  in  the  same  form  as  the  reduced  problem.  In  terms  similar  to  (2.65) 

Cf 

(7*)“^  =  min  /  (2.92a) 

Jo 

subject  to  the  constraints 

x'  =  Ax  +  Bv  +  Mp  (2.92b) 

0  =  Cx  +  Np  (2.92c) 

A'  =  -A^Ao-C^Ai,  Ao(0)  =  Ao(t/)  =  0  (2.92d) 

Z'  =  \p^Vpp,  Zi0)  =  0,  Z{tf)>l  (2.92e) 

0  =  Vpp  +  M'^Xo  +  N'^Xi  (2.92f) 

(2.92g) 


(2.88a) 

(2.88b) 

(2.88c) 


0  <  <  1. 
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For  this  formulation  to  be  worthwhile,  the  benefit  obtained  from  association  of 
model  parameters  with  quantities  of  interest  should  outweigh  the  impact  of  the  in¬ 
creased  dimension  of  the  problem.  In  addition,  (2.92b)  and  (2.92c)  comprise  an  index 
one  BVP,  so  optimization  software  used  to  solve  it  must  have  the  capability  to  handle 
DAEs. 


2.4.3  Controlled  Systems 

Even  in  cases  for  which  a  reduced  model  is  acceptable,  the  presence  of  a  known  ref¬ 
erence  control,  u,  may  appear  to  complicate  matters.  In  actuality,  controlled  systems 
can  easily  be  handled  by  the  algorithm.  By  using  the  same  input  channels  for  the 
detection  signal,  v,  and  the  control,  u,  the  problem  goes  from 


to 


min||u||  subject  to  jnax^  J„(/?)  >  1 


(2.93) 


min||u||  subject  to  ^max^  Ju+i,(,S)  >  1.  (2.94) 

Thus  instead  of  v  appearing  in  the  constraints,  u  +  v  would  appear.  Note  that  in  this 
case,  it  is  possible  that  a  zero  detection  signal  may  be  the  minimum  proper.  This 
would  occur  when  the  control  by  itself  is  already  proper. 

If  the  reference  control  does  not  come  in  on  the  same  channels  as  the  detection 
signal  the  differential  equation  becomes 

x'i  =  AjXi  +  BiV  +  EiU  +  MiPi-  (2.95) 

By  letting  w  —  [wi,W2]  =  [t’,w],  we  can  construct  a  problem  similar  to  (2.94) 

min||u;i||  subject  to  max  J„,(/d)  >  1  and  W2  =  u.  (2.96) 
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An  alternative  approach  to  solving  the  controlled  system  is  to  eliminate  u  just 
as  we  eliminate  the  output,  y,  to  obtain  (2.13).  Using  the  expression  for  the  known 
control,  model  reduction  is  accomplished  as  described  in  Section  2.1.2. 

2.4.4  Alternative  Cost  Functions 

Another  complication  occurs  when  the  energy  of  the  detection  signal  is  not  the  desired 
cost  function.  One  possible  alternative  cost  function  is  the  power  of  the  detection 
signal,  with  noise  of  bounded  power.  For  this  case,  the  noise  model,  (2.3),  becomes 

||^,||2=  f' \pi{t)\Ut<Ktf,  i  =  0,l,K>0.  (2.97) 

Jo 

By  retracing  the  problem  development  in  the  first  part  of  this  chaper,  we  see  that  the 
only  differences  between  the  bounded  power  noise/minimum  power  detection  signal 
case  and  the  bounded  energy  noise/minimum  energy  detection  signal  case  are  the 
values  of  the  right  hand  sides  of  the  inequalities  (2.3)  and  (2.97),  and  the  scaling 
of  the  objective  function.  Thus,  for  the  bounded  power/minimum  power  detection 


signal  problem,  (2.65)  becomes 

mm— —  [  (2.98a) 

V,l3  Ktf  7o 

subject  to  the  constraints 

x'  =  Ax  +  Bv  +  Mp  (2.98b) 

A'  =  -Qx-l-Hp-A^X,  A(0)  =  A(t/)  =  0  (2.98c) 

Z'  =  i  ,  Z(0)  =  0,  Z(t,)>Kt,  (2.98d) 

0  =  Rp+\h'^x-FM'^X  (2.98e) 

0  <  <  1.  (2.98f) 


Other  alternative  problems  that  introduce  complications  no  more  difficult  than 


those  above  include 
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•  bounded  power  noise/minimum  energy  detection  signal, 

•  bounded  power  noise/minimum  of  a  function  of  the  detection  signal  (Qx+ 

Rv), 

•  bounded  energy  noise/minimum  of  a  function  of  the  detection  signal  (Q.t+ 

Rv). 

Each  of  the  problems  in  this  list  is  easily  constructed  and  efficiently  handled  by  SOCS. 

2.4.5  Knowledge  of  Initial  Conditions 

As  a  final  variation  on  the  problem,  consider  the  case  in  which  a  weighted  initial 
condition  is  added  to  the  noise  constraint.  In  this  case,  instead  of  the  bounded 
energy  noise  model,  (2.3),  we  have 

\ljii{t)fdt  <  1,  f  =  0, 1  (2.99) 

where  the  Xi(0)  are  the  initial  states  and  the  are  the  weight  matrices.  This  can 
also  be  generalized  to 

Pf 

«Sai,t(a:i(0),/ij)  =  [a;i(0)  -  ai]^Pi7o^[xi(0)  -  Oj]  +  /  |//i(t)|^dt  <  1,  i  =  0, 1  (2.100) 

Jo 

for  some  fixed  vector  Oj.  (2.99)  is  the  case  addressed  in  [30],  and  this  formulation  leads 
to  quite  a  different  method.  The  approach  is  made  possible  by  the  elimination  of  the 
long  axis  in  each  output  set  caused  by  the  free  initial  condition.  Nikoukhah,  et  al.  [30] 
use  standard  Kalman  filtering  combined  with  extensive  Riccati  difl'erential  equation 
theory  to  derive  an  elegant  method  for  computing  the  minimum  energy  detection 
signal  and  the  separability  index.  One  chooses  a  7,  then  solves  a  Riccati  equation 
until  it  diverges.  If  the  divergence  occurs  inside  the  test  period,  increase  7  and  repeat. 
If  the  Riccati  equation  diverges  past  the  end  of  the  test  period,  decrease  7  and  repeat. 
Using  a  bisection  method,  7*  can  be  found  with  any  desired  accuracy.  Once  7*  is 
fixed,  the  minimum  energy  detection  signal  is  computed  by  solving  a  two-point  BVP 


Si{xi{0),pi)  =  Xi{0)'^PiQXi{0)  4-  f 

Jo 
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via  two  other  Riccati  equations.  Thus,  whereas  our  approach  is  direct,  relying  on  the 
basics  of  optimal  control  theory,  the  approach  of  [30]  is  recursive  in  nature,  relying 
on  the  stability  of  Riccati  matrices.  Unfortunately,  the  approach  of  [30]  is  limited  to 
linear  systems  of  only  two  models. 

Our  approach,  however  is  not  only  extendable  to  certain  nonlinear  systems  we 
discuss  in  Chapter  5,  it  can  handle  systems  of  more  than  two  models  as  previously 
mentioned,  and  it  can  handle  the  noise  models  (2.99)  and  (2.100).  In  fact,  the  theory 
for  those  noise  models  is  actually  slightly  simpler  due  to  the  elimination  of  the  long 
axis  in  the  output  sets.  P{x,p,,^)  from  (2.18)  becomes 

P(x,M,/3)  =  /3io(0)Xi}^o((l)  +  (1  -  «ii(OyPi:o'ii(0)  + 

-  /  Qx  +  H It  +  iV R{i  dt  (2.101) 

2  Jq 

but  the  new  term  is  positive  as  long  as  Pj^o  >  0.  Thus,  Ju(/0)  =  mva.P(x,  p,  P)  >  1 
still  characterizes  a  proper  detection  signal.  While  the  boundary  conditions  change 
slightly,  the  new  term  involving  the  initial  conditions  does  not  appear  in  the  varia¬ 
tional  equations.  Thus  the  problem  is  virtually  the  same,  and  we  expect  the  impact  of 
the  weight  matrices  to  diminish  as  the  test  period  increases  in  length.  SOCS  should 
encounter  no  difficulties  from  this  variation. 

2.4.6  Conclusion 

While  other  variations  of  the  basic  problem  exist  which  may  be  more  difficult  to 
accomodate,  it  is  clear  from  the  cases  discussed  in  this  section  that  the  algorithm 
is  quite  flexible  and  can  be  adapted  to  solve  many  types  of  multi-model  problems. 
In  fact,  current  theory  for  the  multiple  fault  model  case  only  provides  for  problem 
development;  it  does  not  provide  a  practical  means  with  which  to  solve  the  problem. 
Our  MEDS  algorithm  not  only  provides  theoretical  development  for  the  multiple  fault 
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model  case,  it  also  supplies  a  simple  method  for  solving  such  problems. 


Chapter  3 


Model  Identification  via  the  Separating 
Hyperplane 

3.1  The  Problem  -  Determining  the  Origin  of  a 
Given  Output 

As  was  shown  in  the  previous  chapter,  the  MEDS  algorithm  guarantees  that  the 
output  sets,  A"{v),  from  the  two  possible  system  models  are  disjoint.  In  other  words, 
given  an  output  from  the  system,  it  is  the  result  of  one  model  or  the  other,  but 
not  both.  The  algorithm  does  not,  however,  tell  us  from  which  model  the  output 
is  derived.  In  this  chapter,  we  will  address  the  model  identification  step,  and  by 
specifying  the  correct  model  for  a  given  output  using  a  separating  hyperplane,  we 
will  complete  the  development  of  our  multi-model  approach  to  fault  detection  and 
model  identification  in  linear  descriptor  systems. 
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3.1.1  Problem  Setup 

Recall  that  the  true  model  of  the  system  is  one  of  two  models 

x'^  =  AiXi  +  BiV  +  Afj/ij  (3.1a) 

Vi  =  CiXi  +  NiPi  (3.1b) 

for  i  =  0  and  1,  where  all  variables  are  as  previously  defined  except  v,  which  is  now 
the  minimum  energy  detection  signal  from  the  MEDS  algorithm.  In  addition,  with 
the  application  of  v,  each  model  has  a  distinct  output,  and  so  the  output,  y,  is  now 
subscripted. 

Recall  also  from  Section  2.1.1,  that  the  output  sets,  A^{v),  are  open  convex  sets. 
The  sets  are  open  because  ||/ri||  <  1  and  the  Ni  are  full  row  rank.  In  effect,  the 

noise  contribution  to  an  output  set  can  be  likened  to  the  addition  of  an  open  “ball” 

(in  the  sense)  to  the  boundaries  of  the  noiseless  output  sets.  The  fact  that  the 
minimum  energy  detection  signal  is  being  applied  implies  that  while  the  output  sets 
are  disjoint,  their  closures  share  at  least  one  common  point  at  any  given  time.  This 
occurs  due  to  the  definition  of  minimum  proper.  Suppose  a  detection  signal  which 
is  infinitesimally  “smaller”  than  the  minimum  energy  detection  signal  is  applied.  In 
this  case,  the  detection  signal  would  not  be  proper,  and  the  output  sets  would  not 
be  disjoint.  In  order  for  the  open  output  sets  to  intersect  with  the  application  of  the 
infinitesimally  smaller  detection  signal,  the  closures  of  the  output  sets  must  intersect 
with  the  application  of  the  minimum  energy  proper  detection  signal. 

Assuming  a  unique  point  of  intersection  of  the  closures  of  the  output  sets,  which 
occurs  when  at  least  one  of  the  sets  is  strictly  convex,  that  point  is  easily  computed. 
Recall  from  the  previous  chapter  that  the  output  equations  were  combined  in  order 
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to  reduce  the  dimension  of  the  model.  That  is 

y  =  CqXq  +  Nopo  (3.2a) 

y  =  CiXi  +  NifTi  (3.2b) 

were  equated  to  each  other  to  eliminate  y.  When  the  optimal  trajectory,  xi,  and 
the  “optimal”  noise,  Hi,  (from  the  optimal  solution  to  the  MEDS  algorithm)  are 
substituted  into  equation  (3.2b),  the  resulting  y  is  the  point  of  intersection  of  the 
closures  of  the  A^{v).  This  is  true  because  the  MEDS  algorithm  actually  makes  use 
of  the  complements  of  the  output  sets  in  the  optimization,  finding  the  detection  signal 
of  smallest  norm  that  does  not  satisfy  the  not  proper  conditions,  as  discussed  in  the 
previous  chapter.  The  complements  of  the  output  sets  are  closed  sets  in  L^,  and  the 
y  that  results  from  inserting  the  solution  of  the  MEDS  algorithm  into  (3.2)  is  on  the 
shared  boundary  of  these  sets.  Note  that  we  could  use  either  equation  of  (3.2)  to 
compute  y,  but  in  the  model  reduction,  part  of  po  is  eliminated;  all  of  pi  is  available 
from  the  optimal  solution.  For  the  remainder  of  this  discussion,  let  the  common  y  be 
called  y. 

Problems  may  result  when  the  closures  of  the  output  sets  intersect  at  more  than 
one  point  at  any  given  time.  Non-unique  intersections  among  convex  sets  can  only 
occur  when  the  sets  have  flat,  parallel  sides  along  their  common  border,  i.e.,  both  sets 
are  not  strictly  convex.  This  geometry  is  present  in  cases  where  both  Ai  matrices  share 
a  common  eigenvalue,  eigenvector  pair.  While  this  occurrence  is  rare,  since  perturbing 
an  element  of  a  matrix  almost  always  changes  all  of  the  eigenvalues  of  that  matrix, 
software  packages  may  fail  in  its  presence  (due  to  a  non-unique  optimal  solution), 
and  thus  we  should  be  prepared  for  it.  To  counteract  its  effect,  we  simply  project  y 
by  multiplying  both  output  equations  by  a  matrix  that  eliminates  the  parallel  part. 
This  operation  will  result  in  lower  dimensional  output  sets  which  have  no  flat,  parallel 
sides.  In  the  remainder  of  our  discussion  we  will  assume  that  such  a  projection  has 
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been  applied  and  the  resulting  lower  dimensional  output  sets  are  strictly  convex. 

3.1.2  The  Separating  Hyperplane 

A  convenient  feature  of  convex  sets  is  the  separaMng  hyperplane  [34].  Given  two 
disjoint  convex  sets,  there  exists  a  hyperplane  that  separates  the  sets;  that  is,  one  set 
is  above  the  hyperplane  while  the  other  set  is  below.  Such  a  hyperplane  exists  for 
the  two  output  sets  A'‘{v),  and  it  contains  the  point  y.  In  fact,  because  the  corners 
of  the  output  sets  are  smoothed  by  the  contribution  of  the  noise  “balls” ,  and  because 
we  have  assumed  a  unique  y,  the  separating  hyperplane  for  the  output  sets  is  tangent 
to  both  sets,  and  at  any  given  time  it  is  unique.  By  inserting  a  known  output  from 
one  of  the  models  into  the  equation  of  the  hyperplane  (defined  by  its  normal  and  a 
point  on  the  plane,  in  this  case  y),  it  can  be  determined  whether  that  set  lies  above 
or  below  the  plane.  By  inserting  an  output,  the  origin  of  which  is  unknown,  into 
the  equation  of  the  hyperplane,  and  subsequently  observing  the  sign  of  the  result,  we 
can  determine  from  which  model  that  output  originates,  and  thus  accomplish  model 
identification. 

Mathematically,  the  existence  of  the  separating  hyperplane  implies  that  there  is 
a  function  a{t)  €  such  that  if  we  define 

<j>{y)  =  {a, y-y)=  [  a{tf[y{t)  -  y(t)]  dt  (3.3) 

Jo 

we  have  that  (j)  is  nonnegative  on  one  M'(u)  and  nonpositive  on  the  other  M'('a)-  The 
function  a{t)  is  the  normal  to  the  separating  hyperplane.  We  call  cj)  the  test  function. 

Numerical  and  system  error  can  cause  the  output  sets  to  overlap  when  v  is  applied. 
To  compensate  for  this  effect  one  would  likely  apply  5v  with  5  a  little  more  than  one, 
resulting  in  output  sets  from  the  two  models  which  are  such  a  distance  apart  that 
their  closures  do  not  intersect.  Thus,  it  will  be  important  to  have  a  test  function  that 
works  for  detection  signals  larger  than  v. 


Chapter  3.  Model  Identification  via  the  Separating  Hyperplane 


72 


Recall  from  Section  2.1.1,  the  output  equation  for  each  model  is 

y  =  CiCi{Biv)  +  CiCi{MiPi)  +  +  Nipi  (3.4) 

where  CiCi{Biv)  is  a  vector  depending  linearly  on  v.  Applying  5v  translates  ^*(0)  by 
5CiCi{Biv).  Using  5  >  1  causes  translation  of  A^{v)  by  (5  —  \)CiCiBiV.  The  vector  y 
was  on  the  boundary  of  Ai{v).  Thus  y  +  {5  -  l)CiCiBiV  is  now  on  the  boundary  of 
Ai{5v)  and  the  two  output  sets  are  disjoint.  As  a  result,  the  supporting  hyperplane 
at  either  y  +  {5  -  1)CoCoBqv  ory  +  {5-  l)CiCiBiV  is  still  a  separating  hyperplane 
(not  unique)  and  the  normal  is  still  a{t).  Therefore,  we  may  define  the  test  function 

My)  =  r  -  W)  -  (5  -  l)9(i))rfi  (3.5) 

Jo 

where  q{t)  =  tCoCoBqv  +  (1  -  t)CiCiBiV  for  a  fixed  0  <  r  <  1.  Note  that  choosing 
q  this  way  with  0  <  r  <  1  makes  the  test  strictly  positive  on  one  output  set  and 
strictly  negative  on  the  other,  which  is  more  computationally  robust.  In  the  examples 
of  Chapter  4  we  choose  r  =  See  Figure  3.1  for  a  finite  dimensional  depiction  of 
this  discussion. 

3.1.3  Approximating  the  Separating  Hyperplane 

Unfortunately,  no  analytic  characterization  exists  for  the  boundaries  of  the  output 
sets  at  y.  This  characterization  is  required  for  direct  computation  of  the  tangent  at 
the  point  of  intersection.  Thus,  the  equation  of  the  separating  (tangent)  hyperplane 
is  hard  to  compute.  One  way  to  compensate  for  the  lack  of  a  characterization  of  the 
output  set  boundaries  at  y  is  to  artificially  force  the  sets  apart.  Suppose  we  have  done 
so,  and  let  the  points  on  the  boundary  of  each  set  which  are  closest  to  each  other  be 
called  yg  and  y^,  respectively.  The  equation  of  the  line  segment  (yg  —  y^)  will  be  the 
normal  to  a  separating  hyperplane  for  the  forced-apart  sets.  This  normal,  along  with 
a  point  on  the  line  segment,  can  be  used  to  define  the  separating  hyperplane.  If  the 
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output  sets  new  separating 

under  larger  V  hyperplane 


separating  under  minimal  v 

hyperplane 


Figure  3.1:  Output  sets  under  application  oft?  and  Sv,  5  >  1 

sets  have  been  separated  correctly,  this  separating  hyperplane  will  approximate  the 
separating  hyperplane  for  the  original  output  sets. 

The  accuracy  of  the  approximation  depends  on  how  the  sets  are  forced  apart. 
One  way  to  do  it  is  to  shrink  the  contribution  of  the  noise  vectors.  Since  the  noise 
contribution  to  the  output  is  a  (possibly  misshapen)  “ball” ,  multiplying  the  noise  by 
a  factor,  e  <  1,  will  force  the  output  sets  apart  in  such  a  way  that  the  accuracy  of 
the  approximation  can  be  made  better  by  selection  of  a  larger  e. 

In  fact,  reducing  the  noise  contribution  is  equivalent  to  increasing  the  detection 
signal.  To  see  this  fact,  consider  a  generic  model  under  the  application  of  minimal 
energy  detection  signal,  v,  and  reduced  noise,  ep. 

x'  = 


y 


Ax  +  Bv  +  Mep 
Cx  +  Nep. 


(3.6a) 

(3.6b) 
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Multiplying  both  equations  by  we  obtain 

(£)'  =  .  (3.7a, 

0  =  C0+JVp.  (3.7b) 

Letting  z  =  f ,  n;  =  ^  and  (5  =  (3.7)  becomes 

z'  ^  Az  +  BSv  +  Mp  (3.8a) 

w  =  Cz  +  Np  (3.8b) 


which  is  just  the  original  generic  model  under  the  application  of  larger  than  minimal 
detection  signal,  5v,  but  with  full  noise  contribution,  p.  This  equivalence  is  important 
in  that  it  allows  us  to  use  the  model  identification  test  function  (3.5)  for  the  reduced 
noise  problem  as  well  as  for  the  larger-than-minimum  energy  detection  signal  problem. 
In  terms  of  the  output  sets,  the  equivalence  may  be  stated  as  follows:  if  Al{v)  is 
the  output  from  model  i  using  detection  signal  v  and  noise  weighting  e,  so  that 
JC^{v)  =  A^{v),  then  Ai,{v)  =  eA^{]v). 

3.1.4  Problem  Statement 

The  discussion  of  the  preceding  section  leads  us  to  a  new  form  for  the  system  models. 

x[  =  AiXi  +  BiV  +  Miepi  (3.9a) 

y,  =  CiXi  +  NiCpi  (3.9b) 

for  i  =  0  and  1.  To  find  the  separating  hyperplane  for  the  output  sets,  we  must  find 
the  points  on  the  boundaries  of  the  closures  of  the  sets  that  are  closest  to  each  other. 
In  order  to  include  the  boundaries  of  the  otherwise  open  output  sets,  we  must  change 
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our  bounded  energy  noise  model  to 

\\Pif<l,  i  =  0,l.  (3.10) 

To  compute  the  normal  to  the  separating  hyperplane,  simply  minimize  ||yo  ~  Z/i|P 
subject  to  (3.9-3.10)  using  an  optimal  control  package.  SOCS  is  quite  capable  of 
handling  this  problem.  The  solutions  and  obtained  can  be  difl'erenccd  and 
normalized  to  compute  the  normal  to  the  separating  hyperplane.  Any  point  on  the 
line  segment  between  y^  and  may  be  used  as  the  defining  point  on  the  hyperplane. 
See  Figure  3.2  for  a  finite  dimensional  depiction  of  this  discussion. 


Figure  3.2:  Output  sets  under  full  and  reduced  noise  contributions 

As  stated  earlier,  the  accuracy  of  the  approximation  obtained  from  this  problem 
can  be  improved  simply  by  selecting  a  larger  e.  A  bound  on  the  error  is  characterized 
in  the  next  section.  With  the  problem  now  fully  defined,  we  present  the  model 
identification  algorithm. 
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3.2  The  Model  Identification  Algorithm 

As  described  in  Chapter  2,  Boeing’s  SOCS  package  [3,  4]  efficiently  solves  the  op¬ 
timization  problem  described  above.  MATLAB,  by  The  MathWorks,  Inc.  [26]  is 
convenient  for  subsequent  computations.  While  we  do  not  use  MAPLE,  by  Water¬ 
loo  Maple,  Inc.  [18]  to  generate  FORTRAN  code  as  for  the  MEDS  algorithm,  higher 
dimensional  problems  will  make  it  desirable  to  do  so.  The  model  identification  (MI) 
algorithm  (with  appropriate  software  in  parentheses): 

1.  Let  u  be  the  minimum  energy  proper  detection  signal  from  the  MEDS  algorithm 
(SOCS) 

2.  Choose  a  value  for  e  <  1 

3.  Perform  constrained  optimization  (SOCS) 

min||yo-yi|P  (3.11a) 

subject  to  the  constraints 


=  AiXi  -h  BiV  -1-  Micpi 

(3.11b) 

Vi 

=  CiXi  -\-  Nitpi 

(3.11c) 

Qi 

=  gi(0)  =  0,  Qiitf)  <  1 

(3.11d) 

for  i  =  0  and  1 

4.  Let  yo,e  closest  points  computed  by  the  optimization 

5.  Compute  a^{t),  the  normal  to  the  separating  hyperplane  (MATLAB) 
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6.  Compute  ydt),  the  point  on  the  separating  hyperplane,  as  the  midpoint  of  the 
line  segment  connecting  and  ^  (i.e.,  "t  =  5)  (MATLAB) 


_  Vo,,  +  2/1, e 

Ve  = - ;; - 


(3.13) 


7.  Let  4>e(z)  =  (a^fZ  —  y^  be  the  test  function.  Then 

4(2/0)  =  (oe,  2/0 -yj  >0 


4(2/1)  =  (fle,  2/1  -  2/t)  <0 

or  vice  versa,  where  yi  is  an  unknown  output  from  model  z,  i  =  0  or  1 

8.  Suppose  a  known  output  from  model  0  produces  a  positive  test  function  value. 
Then  if  an  unknown  output  produces  a  positive  test  function  value,  it  derives 
from  model  0.  If  the  unknown  output  produces  a  negative  test  function  value, 
it  derives  from  model  1.  If  a  zero  test  function  value  is  produced,  an  error  has 
occurred  and  a  smaller  value  of  e  should  be  selected. 

Note  that  as  e  — )•  1,  the  computed  normal  approaches  the  true  normal.  Unfortu¬ 
nately,  numerical  error  in  (3.12)  increases  when  (1  -  e)  is  very  small  due  to  division 
by  small  numbers.  Thus,  (1  —  e)  should  be  chosen  sufficiently  large  to  ensure  a  dis¬ 
crete  distance  between  output  sets,  in  order  to  reduce  the  effect  of  nnnK'iical  error. 
However,  (1  —  e)  must  be  small  enough  to  ensure  that  the  skewing  effect  from  the 
weight  matrices  on  the  noise  inputs  are  included,  in  order  to  increase  the  accuracy  of 
the  computed  normal. 

The  following  theorem  summarizes  the  accuracy  of  the  computed  test  function 

[12]. 

Theorem  3.1.  Let  a^{t)  be  the  normal  from  the  MI  algorithm,  using  0  <  e  <  1 
and  let  yo.oyi.e  values  of  yo,yi  tho,t  give  the  m,inim,um  distance.  Let  y^  = 
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i  (2/0,6  +  yi,e)-  hyperplane  test  be 

(l),{w)  =<  ae{t),  w-y,>  .  (3.14) 

Let  6(e)  =  _  Then  there  is  a  constant  K  so  that 

6(e)  <  K(1  -  e)  (3.15) 

and  if  w  E  (^^(t')  U  A\(f6)),  then 

M'^)>6(e)  weA'^,(v)  (3.16) 

Proof  (from  [12])  (3.16)  and  (3.17)  follow  from  noting  that  (feiw)  =  6(e)  is  the 

supporting  hyperplane  of  A^^(v)  at  yo,6  while  (/»e(t«)  =  —6(e)  is  a  parallel  supporting 
hyperplane  of  .Ag  (u)  aty^  ,,.  K  can  be  taken  as  ||Co>Co.^o  +  .^o||  +  i|C'i£iMi  +  A/il|.  □ 

K  is  related  to  the  linear  operators  applied  to  the  noise  vectors  and  is  thus  a 
measure  of  the  amount  of  skewing  that  occurs  in  the  output  sets  due  to  the  noise 
input.  In  practice,  the  tests  are  often  much  better  than  the  theorem  indicates  but 
the  result  allows  for  highly  skewed  convex  sets. 

It  is  important  to  note  that  the  MI  algorithm  can  be  used  if  the  applied  detection 
signal  is  larger  than  the  minimal  proper.  If  v  is  proper  and  not  minimal,  then  one 
can  set  e  =  1  in  the  algorithm  and  still  obtain  a  nonzero  distance  between  the  output 
sets,  due  to  the  equivalence  between  the  reduced  noise  and  increased  detection  signal 
approaches.  In  fact,  one  can  use  a  combination  of  both  reduced  noise  input  and 
increased  detection  signal,  which  is  useful  when  it  is  not  known  whether  the  detection 
signal  to  be  applied  is  minimal  or  not.  This  will  be  important  when  we  discuss  the 
case  with  multiple  fault  models.  A  sample  driver  file  for  the  MI  algorithm  coded  in 
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SOCS,  along  with  a  MATLAB  m-file  for  subsequent  computations  are  in  Appendix 
A. 


3.3  Variations 

Like  the  MEDS  algorithm,  applications  of  the  MI  algorithm  are  limited  by  the  as¬ 
sumptions  and  formulation  of  the  problem.  Fortunately,  the  assumptions  and  for¬ 
mulation  of  the  model  identification  problem  are  much  more  conducive  to  extensions 
than  the  minimum  energy  detection  signal  problem.  Extensions  are  easily  made  to 
multiple  fault  models,  alternative  formulations,  pre-existing  controls,  alternative  cost 
functions,  and  different  initial  conditions. 

3.3.1  Multiple  fault  models 

The  result  of  the  MEDS  algorithm  consists  of  a  group  of  disjoint  output  sets,  one  for 
each  possible  model.  Since  we  have  no  a  priori  knowledge  of  the  spatial  locations  of 
these  sets,  the  MI  algorithm  must  be  used  to  separate  each  pair  of  sets.  If  there  are 
n  possible  models,  then  there  will  be  separating  hyperplanes.  Each  output  set 

will  exhibit  a  unique  combination  of  test  function  signs  (positive/negative),  for  those 
test  functions  which  can  discriminate  that  output  set. 

For  example,  take  the  case  in  which  three  models  exist  for  the  system,  and  let 
the  output  sets  of  the  models  be  called  Tq)  ^'2-  Let  the  hyperplane  sepa¬ 

rating  Yq  and  Yi  be  called  Hoi,  separating  To  and  Y2  be  called  H02,  and  separat¬ 
ing  Ti  and  Y2  be  called  Hu-  Finally,  let  the  signs  of  the  test  function  for  output 
y  be  {sgn{Hoi),sgn{Ho2),sgn{Hu))y-  Suppose  the  known  output  from  To  exhibits 
(+)-)*))  from  Ti  exhibits  (-,*,+),  and  from  T2  exhibits  (*,+,-),  where  *  indi¬ 
cates  the  inability  of  that  test  function  to  discriminate  the  output  set.  Then,  a  valid 


Chapter  3.  Model  Identification  via  the  Separating  Hyperplane 


80 


unknown  output  exhibiting  any  possible  combination  of  test  function  signs  will  fall 
into  one  and  only  one  of  the  output  sets.  Note  that  it  is  not  possible  for  certain 
combinations  of  test  function  signs  to  be  exhibited  by  a  valid  output,  (+,+,+),  for 
instance. 

It  should  be  noted  that  the  minimum  energy  detection  signal  for  the  multiple  fault 
model  problem  may  not  be  minimal  for  any  of  the  pairwise  problems.  It  will  always 
be  proper  due  to  the  construction  of  the  combined  problem,  but  it  may  be  larger 
than  required.  In  addition,  it  may  not  be  apparent  for  which  pairwise  problem  the 
detection  signal  is  larger  than  minimal.  While  approaches  such  as  that  of  [30]  must 
use  the  minimal  energy  detection  signal,  the  algorithm  and  test  function  described 
in  this  chapter  can  still  be  applied  when  the  detection  signal  may  not  be  minimal 
energy. 

Finally,  if  the  goal  is  to  detect  a  fault,  but  it  is  not  important  to  distinguish  be¬ 
tween  fault  models,  the  detection  signal  should  be  minimized  by  solving  only  those 
pairwise  problems  including  the  nominal  model  (  i.e.,  in  the  three  model  case,  model 
0  versus  model  1,  and  model  0  versus  model  2).  The  nominal  model  will  be  distin¬ 
guishable  from  the  fault  models,  but  more  than  one  separating  hyperplane  will  be 
required.  Outputs  from  the  fault  models  may  wrap  around  the  nominal  model,  in 
which  case  a  single  hyperplane  will  not  have  the  nominal  model  on  one  side  and  all 
fault  models  on  the  other  side.  Examples  of  the  multiple  fault  model  problem  are 
included  in  the  next  chapter. 

3.3.2  Alternative  Formulations 

Since  the  MI  algorithm  formulates  the  problem  in  terms  of  the  original  system  ma¬ 
trices,  and  no  reduction  in  system  dimension  is  attempted,  the  problem  is  already 
in  unreduced  form.  Thus,  since  parameters  representing  physical  quantities  are  not 
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combined  and  sparsity  is  not  lost,  the  question  of  whether  or  not  to  reduce  the  model 
is  not  an  issue  as  it  was  in  the  MEDS  algorithm.  While  alternative  formulations 
are  probably  available,  none  could  describe  the  problem  as  plainly,  and  in  such  a 
straightforward  manner  as  the  present  formulation. 

3.3.3  Controlled  Systems 

As  was  the  case  for  the  MEDS  algorithm,  the  presence  of  a  known  control  does 
not  present  any  difficulty  for  the  MI  algorithm.  The  output  of  the  MEDS  algorithm 
combines  the  known  control  with  the  detection  signal.  This  combined  signal  is  merely 
a  known  input  to  the  MI  algorithm,  and  the  fact  that  a  control  is  present  is  transparent 
to  the  algorithm.  Other  types  of  controls  as  described  in  Chapter  2  arc  equally  as 
transparent. 

3.3.4  Alternative  Cost  Functions 

The  issue  of  alternative  cost  functions  arose  in  the  MEDS  algorithm  because  viable 
alternative  costs  exist  for  that  problem.  In  the  MI  algorithm,  the  cost  function  is 
merely  a  means  by  which  to  compute  the  separating  hyperplane  via  the  two  closest 
points  on  the  closure  of  each  output  set.  Thus  it  is  the  closest  points  which  are  of 
interest,  and  not  the  optimal  cost.  Therefore,  the  issue  of  alternative  cost  functions 
is  not  important  at  all  to  the  model  identification  problem. 

3.3.5  Knowledge  of  Initial  Conditions 

The  variation  of  the  noise  model  presented  in  the  “Variations”  section  of  Chapter  2  is 
as  transparent  to  our  MI  algorithm  as  it  is  to  the  MEDS  algorithm.  The  only  change 
occurs  in  the  boundary  conditions  for  the  differential  equation. 
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Recall,  however  that  variations  on  the  initial  conditions  led  Nikoukhah,  et  al.  [30] 
in  a  different  direction  for  computing  v.  While  the  separating  hyperplane  approach  is 
also  utilized  in  [30],  the  method  for  computing  it  is  quite  different.  Instead  of  formu¬ 
lating  an  optimization  problem,  one  of  the  Riccati  equations  solved  while  computing 
V  is  used  along  with  a  new  BVR  The  normal  to  the  separating  hyperplane  is  shown  to 
be  related  to  the  Lagrange  multiplier  of  the  detection  signal  Riccati  equation.  Care 
must  be  taken  to  avoid  introducing  large  errors  into  the  solution  via  the  integra¬ 
tion,  but  otherwise  the  computation  of  the  normal  is  a  simple  BVP  calculation.  The 
shortcoming  of  the  approach  is  that  it  must  use  v,  and  not  a  larger  multiple.  By  its 
nature,  the  calculation  uses  the  detection  signal  Riccati  equation  that  produces  v,  so 
Sv,  where  5  >  1,  is  not  available.  The  MI  algorithm  has  already  been  shown  to  be 
robust  to  the  use  of  larger  multiples  of  v. 

3.3.6  Conclusion 

It  is  clear  from  the  problem  formulation,  the  algorithm  description,  and  the  above 
exploration  of  possible  variations  to  the  model  identification  problem  that  it  is  much 
simpler  and  more  straight  forward  than  the  minimum  energy  detection  signal  problem. 
The  question  of  flexibility,  i.e.,  how  easy  it  is  to  adapt  the  algorithm  to  a  larger  set 
of  problems,  is  therefore  quite  easily  answered:  the  MI  algorithm  can  be  adapted  to 
handle  any  problem  the  MEDS  algorithm  can  handle.  As  was  stated  in  the  previous 
chapter,  while  current  theory  for  the  multiple  fault  model  case  only  provides  for 
problem  development,  it  does  not  provide  a  practical  means  with  which  to  solve  the 
problem.  Our  MEDS  and  MI  algorithms  not  only  provide  theoretical  development 
for  the  multiple  fault  model  case,  they  also  supply  a  simple  method  for  solving  such 
problems. 
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Examples  and  Analysis  of  Results 

4.1  The  Complete  Problem  and  Algorithm 

Before  beginning  a  discussion  of  the  various  examples  to  which  the  MEDS  and  MI 
algorithms  have  been  applied,  it  is  beneficial  to  review  the  complete  problem  and 
combined  algorithm.  Recall  from  Chapter  2,  that  the  true  model  of  the  system  is  one 
of  two  models 

x-  =  AiXi  +  BiV  +  Mifii  (4.1a) 

y  =  CiXi  +  Nifii  (4.1b) 

for  i  =  0  and  1.  Our  first  goal  is  to  apply  the  minimum  energy  detection  signal,  v,  such 
that  the  (convex)  output  sets  of  the  two  models  are  disjoint.  Thus,  a  given  output 
may  be  derived  from  only  one  model.  Our  second  goal  is  to  compute  the  equation 
of  the  hyperplane  which  separates  the  two  output  sets.  From  this  equation  we  define 
the  test  function.  The  substitution  of  known  outputs  from  each  model  into  the  test 
function  will  indicate  the  sign  (+/— )  that  each  set  will  exhibit.  The  substitution  of 
an  output  of  unknown  origin  into  the  test  function  will  result  in  a  positive  number 
if  it  is  derived  from  one  model  and  a  negative  number  if  it  is  derived  from  the  other 
model.  Thus  the  correct  model  for  the  system  will  be  identified. 
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These  two  goals  are  accomplished  by  the  fault  detection  and  model  identification 
(FDMI)  algorithm  (the  combination  of  the  MEDS  and  the  MI  algorithms),  repeated 
below  (note:  only  the  Version  Two,  non-Riccati  form  is  repeated  here,  see  Chapter  2 
for  other  developed  forms) 

1.  Perform  QR  decomposition  on  Nf 

Nl  =  QiRi  (4.2) 

where  the  Qi  are  unitary  matrices.  Thus 

Ni  =  RjQj  (4.3) 


2.  Perform  constant  orthogonal  coordinate  changes  on  fXi 


(a)  Let  Rf  =  [Ni  0],  where  Ni  is  invertible 


(b)  Let  Qj Pi  = 


with  the  same  partitioning  as  Rj.  Thus 


NiPi  =  [Ni  0] 


(4.4) 


(c)  Let  MiQ^  ^  =  [Mj  Mi]  with  the  same  partitioning  as  Rj.  Thus 


■ITT  TTi  I 


MiPi  =  MiQi  ^Qj Pi  =  [Mi  Mi 


(4.5) 


Note  the  system  model,  (4.1),  becomes 


x'i  =  AiXi  +  BiV  +  MiPi  +  MiPi 

y  =  CiXi  +  NiPi 


3.  Reduce  model  dimension  by  eliminating  y  and  p^ 


(4.6a) 

(4.6b) 
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(a)  Combine  both  equations  (4.6b)  for  y,  solve  for  and  substitute  into 
(4.6a),  z  =  0 

x'q  =  (ylo  ~  MqNo  ^o)^o  +  MqNq  CiXi  +  BqV  +  Mom  +  MoN q  N i/ij 

(4.7) 

(b)  Let 


{  Xo\ 

Ao-MoNo'Co  MoN 0^ Cl 

,  5  = 

Bo 

a;  = 

,  A  = 

1 

O 

> 

1 _ 

Bi  _ 

lr,\ 

_ _  1 

" 

Mo  MoNo  Ni 

0 

M  = 

,  = 

Ml 

0  Ml 

Ml 

U/ 

Note  the  system  model,  (4.6),  is  now 

x'  —  Ax  +  Bv  +  Mp  (4.8) 


4.  Compute  new  system  matrices 

(a)  Let  C  =  [No^Co  -  Nq^Ci]  and  N  =  No^[0  0],  with  the  columns  of 

N  conforming  to  p 

(b)  Let  H  =  -4/3C'^N  and  Q  =  2PC'^C 

(c)  Let 


R  =  2 


pi  0  0 

0  {1-P)I  +  Pn'^,n/No'Ni  0 

0  0  (1-^)/ 


with  its  rows  and  columns  conforming  to  p 
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(d)  Compute  A,  B,  M,  H,  Q,  and  R 

5.  Perform  constrained  optimization.  Solve 

Cs 

=  min  /  ||f  (4.9a) 

7o 

subject  to  the  constraints 

x'  =  Ax  +  Bv  +  Mp  (4.9b) 

A'  =  -Qx  -  \hp  -  A(0)  =  X{tj)  =  0  (4.9c) 

Z'  =  \[x^Qx  +  x'^Hp^p^Rp],  ^(0)  =  0,  Z{tf)>l  (4.9d) 

0  =  Rp4-\e'^xEM^\  (4.9e) 

0.01  <^<  0.99  (4.9f) 

6.  Let  V  be  the  minimum  energy  proper  detection  signal 

7.  Choose  a  value  for  e  <  1 

8.  Perform  constrained  optimization.  Solve 

min||yo -2/i|P  (4.10a) 

subject  to  the  constraints 

xf^  —  AiXi  +  BiV  +  Micpi  (4.10b) 

Hi  =  CiXi  +  Nicpi  (4.10c) 

9i(0)  =  0.  Qii^f)  <  1  (4.10d) 

for  i  =  0  and  1 


9.  Let  po.e  j  be  the  closest  points  computed  by  the  optimization 
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10.  Compute  ac(t),  the  normal  to  the  separating  hyperplane 

l/l,e 

ac  = 

y0,e  -  Vhe 


(4.11) 


11.  Compute  y^(t),  the  point  on  the  separating  hyperplanc,  as  the  midpoint  of  the 
line  segment  connecting  yo,e  2) 


Ve 


1/0,6  +  yi,e 
2 


(4.12) 


12.  Let  (f)e{z)  =  {a^,z  —  y^)  be  the  test  function.  Then 

4(1/0)  =  (oe,  I/O -i/c)  >0 


4(1/1)  =  (a6,yi -l/e)  <0 

or  vice  versa,  where  yi  is  an  unknown  output  from  model  ?  =  0  or  1 

13.  Suppose  a  known  output  from  model  0  produces  a  positive  test  function  value. 
Then  if  an  unknown  output  produces  a  positive  te.st  function  value,  it  derives 
from  model  0.  If  the  unknown  output  produces  a  negative  test  function  value, 
it  derives  from  model  1.  If  a  zero  test  function  value  is  produced,  an  error  has 
occurred  and  a  smaller  value  of  e  should  be  selected. 


4.2  Introduction  of  Software 

As  alluded  to  in  previous  chapters,  Boeing’s  Sparse  Optimal  Control  Software  (SOCS) 
is  the  main  software  in  which  the  optimal  control  problems  of  the  FDMl  algorithm 
have  been  coded  for  this  thesis.  System  matrices  are  first  reduced  using  MATLAB. 
The  reduced  system  matrices  from  MATLAB  are  inserted  into  the  appropriate  FOR¬ 
TRAN  subroutines.  The  dimensions  of  the  reduced  model’s  matrices  are  fed  into 
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MAPLE  for  the  construction  of  the  constraint  equations.  MAPLE’s  FORTRAN 
translator  is  used  to  convert  these  equations  into  FORTRAN  code  which  are  then 
pasted  into  FORTRAN  subroutines  for  compilation  and  execution.  MAPLE  is  used 
to  compute  the  constraint  equations  symbolically,  so  that  the  problems  may  by  kept 
in  parameterized  form.  MATLAB  is  used  to  analyze  and  plot  the  output  from  the 
SOCS  routines 

4.2.1  SOCS  Parameters 

In  addition  to  the  specific  parameters  for  a  given  problem,  SOCS  allows  the  user  to  set 
various  general  parameters  which  control  the  optimization  method  and  convergence 
tolerances.  In  Chapter  1,  we  described  several  discretization  methods  implemented  by 
SOCS.  By  setting  two  control  parameters  we  have  selected  the  Compressed  Hermite- 
Simpson  discretization  method  for  all  examples.  While  it  is  possible  to  substitute 
another  method  to  solve  the  resulting  finite  dimensional  problem,  we  have  not  done  so, 
but  instead  have  used  the  default  SQP  method  supplied  in  SOCS.  This  combination 
of  discretization  method  and  nonlinear  program  solver  exhibits  good  convergence 
properties  and  efficient  use  of  central  processor  unit  (CPU)  time  and  is  well  suited 
to  the  types  of  differential  equations,  and  and  algebraic  constraints  present  in  our 
problem  set.  Options  that  control  the  operation  and  convergence  of  the  optimization 
program  are: 

•  Sparsity  -  If  set  to  “sparse,”  the  program  takes  advantage  of  sparsity  in 
user  supplied  constraints.  Our  constraints  are  sparse,  so  this  option  is 
used. 

•  Initial  guess  type  -  The  user  has  several  options  available  to  supply  an 
initial  guess  for  the  optimization  program.  Choices  range  from  a  linear 
interpolation  between  endpoint  values,  to  construction  of  an  initial  guess 
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from  a  user-supplied  B-spline  definition  or  explicit  calculation  at  each  of 
the  initial  mesh  points.  All  of  our  examples  use  the  linear  interpolation 
option.  While  this  often  results  in  an  infeasible  initial  guess,  in  every 
case  the  program  rectifies  this  condition  after  a  few  constraint  satisfaction 
iterations. 

•  Initial  mesh  size  -  The  user  sets  the  size  of  the  initial  mesh.  To  reduce 
unnecessary  computation,  the  initial  mesh  should  be  the  coarsest  possible 
that  allows  the  program  to  attain  constraint  satisfaction  after  a  few  iter¬ 
ations.  If  the  initial  guess  is  already  feasible,  the  initial  mesh  size  should 
be  left  at  the  default  value  of  10.  Several  of  our  examples  require  a  finer 
initial  mesh  than  the  default  in  order  to  attain  constraint  satisfaction. 

•  Discretization  order  and  stage  number  -  When  set  to  the  appropriate 
values,  the  program  uses  a  high  order  discretization  method  and  stage 
number  for  mesh  refinement.  While  not  used  in  our  examples,  this  option 
may  be  useful  in  higher  dimensional  problems. 

•  Tolerances  -  The  user  may  set  tolerances  for  the  relative  error  in  the 
objective  function  and  the  differential  equation  constraints,  as  well  as 
various  absolute  errors.  These  tolerances  may  be  set  to  different  values  for 
the  two  different  phases  of  the  optimization  (discretization  and  nonlinear 
program  solver).  Several  of  our  tolerances  are  set  tighter  than  default 
values  to  obtain  more  accurate  solutions,  resulting  in  modest  increases  in 
CPU  times. 

Samples  and  discussions  of  the  MATLAB,  MAPLE,  and  SOCS  routines  used  to  gen¬ 
erate  the  results  of  this  chapter  are  depicted  in  Appendix  A. 
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4.2.2  Choosing  a  Value  of  e 

In  the  previous  chapter,  we  mentioned  that  difficulty  may  arise  in  choosing  the  value 
of  the  noise  multiplier,  e,  in  the  model  identification  part  of  the  algorithm.  There  we 
noted  that  as  e  ->  1,  the  computed  normal  to  the  separating  hyperplane  approaches 
the  true  normal,  but  numerical  error  increases  also.  We  concluded  that  (1  —  e)  should 
be  chosen  sufficiently  large  to  ensure  a  discrete  distance  between  output  sets,  in 
order  to  reduce  the  effect  of  numerical  error.  However,  (1  —  e)  must  be  small  enough 
to  ensure  that  the  skewing  effect  from  the  weight  matrices  on  the  noise  inputs  are 
included,  in  order  to  increase  the  accuracy  of  the  computed  normal. 

To  test  the  effect  of  different  values  of  e  on  the  accuracy  of  the  computed  normal, 
we  ran  several  different  models  with  different  values  of  e.  We  found  that  the  normal 
is  not  very  sensitive  to  the  value  of  e  as  long  as  e  is  not  too  small.  To  ensure  a  discrete 
distance  between  output  sets  without  reducing  the  skewing  effect  of  the  coefficient 
matrices  on  the  noise  vectors,  we  chose  e  =  0.7  for  all  examples.  A  sample  plot  of  the 
computed  normal  of  one  example  problem  for  e  =  0.3, 0.5, 0.7, 0.9  is  shown  in  Figure 
4.1. 


4.3  Introduction  of  Examples 

To  test  the  operation  of  the  algorithm,  as  well  as  to  examine  how  results  vary  with 
the  problem,  we  have  applied  the  FDMI  method  to  fourteen  examples.  Four  of  these 
examples  are  one-dimensional  problems,  and  are  examined  to  shed  light  on  the  per¬ 
formance  of  the  algorithm  on  stable-to-stable  and  stable-to-unstable  fault  situations. 
One  of  these  examples  is  coded  in  all  four  formulations  of  the  algorithm  in  order  to 
compare  the  relative  computational  performance  of  the  different  formulations.  Seven 
examples  are  two-dimensional  problems,  and  are  examined  to  shed  light  on  several 
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phenomena  of  interest.  These  phenomena  include: 

•  V,  j*,  and  {t) ,  and  how  each  varies  with  the  problem  and  the  interval 
length, 

•  how  results  differ  between  stable-to-stable  and  stable-to-unstable  oscilla¬ 
tory  systems  (i.e.,  with  imaginary  or  complex  eigenmodes), 

•  whether  the  algorithm  will  work  for  systems  with  common  modes  (i.e., 
unobservable  systems). 

Another  example  is  three-dimensional,  a  real  world  control  system;  it  is  examined 
to  determine  the  efficiency  of  the  algorithm  for  larger  scale  systems.  The  last  two 
examples  are  each  systems  with  a  nominal  model  and  two  fault  models.  One  of  these 
is  one-dimensional  and  the  other  is  two-dimensional.  These  examples  are  included  to 
demonstrate  how  the  algorithm  handles  multiple  fault  model  systems.  Overall,  the 
example  suite  provides  the  basis  for  a  comprehensive  analysis  of  the  FDMI  algorithm, 
its  strengths,  as  well  as  a  few  of  its  weaknesses. 

4.4  One-Dimensional  State  Examples 

To  begin  our  analysis,  we  look  at  four  one-dimensional  examples.  The  very  first 
example  is  coded  in  all  four  formulations  of  the  algorithm.  This  is  done  both  to 
confirm  results  between  formulations  and  to  attempt  to  determine  if  one  formulation 
is  more  efficient  than  the  others.  The  remaining  examples  shed  light  on  the  shape 
and  energy  of  the  minimum  energy  detection  signal,  v,  as  well  as  the  shapes  of  the 
normal  to  the  separating  hyperplane,  a^it),  the  midpoint  of  the  shortest  line  segment 
between  the  output  sets,  y^{t),  and  the  separability  index,  7*,  for  one-dimensional 
problems. 
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4.4.1  Primary  One-Dimensional  Example 

We  begin  with  the  simplest  problem  for  which  all  terms  appear. 

Example  4.1.  [Change  of  eigenvalue,  sto,ble-to- stable]  This  problem,  corresponds 
to  the  case  where  there  has  been  an  internal  change  in  some  system  parameter  such 
as  friction  in  a  joint  or  resistance  of  a  resistor. 


rr'o 

=  —2Xq  +  +  //2 

(4.13a) 

y 

=  ^0  + 

(4.13b) 

x[ 

=  ~-X\  V  ^  fI/[ 

(4.13c) 

y 

(4.13d) 

This  problem,  as  well  as  each  subsequent  example,  is  formulated  such  that  the 
constant  change  of  coordinates  described  in  the  algorithm  is  unnecessary.  Model 
reduction  is  much  more  straightforward  with  this  formulation,  but  no  generality  is 
lost  in  the  process.  Figures  4. 2-4. 3  depict  F  for  =  1, 10, 20, 100  respectively.  Figure 
4.4  shows  V  for  tj  =  20, 100  plotted  to  the  same  scale. 


Figure  4.2:  v,  for  Example  4.1:  tj  —  1  (left),  tj  =  10  (right) 
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Figure  4.3:  v  for  Example  4.1:  tf  =  20  (left),  =  100  (right) 


Figure  4.4:  v  for  Example  4.1:  tj  =  20, 100 
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Figure  4.5  shows  the  relationship  between  j*  and  the  interval  length.  Table  4.1 
compares  7*  and  v  for  various  interval  lengths. 


Figure  4.5;  7*  for  Example  4.1  as  a  function  of  if 


Table  4.1:  7*  and  ||t;||  for  Example  4.1:  tf  =  1, 10,20, 100 


tf 

7* 

V 

1 

0.02075 

48.1919 

10 

0.18719 

5.3423 

20 

0.19498 

5.1288 

100 

0.19736 

5.0669 

As  the  plots  and  the  table  show,  the  energy  of  v  decreases  as  the  interval  length 
increases.  However,  once  a  threshold  interval  length  is  reached,  no  significant  de¬ 
crease  in  the  energy  of  v,  or  increase  in  7*  occurs.  It  should  be  noted  that  as  the 
interval  length  decreases,  7*  becomes  very  small,  indicating  the  difficulty  involved  in 
distinguishing  between  the  two  models  on  very  short  intervals.  Since  ||u||  is  equal  to 
the  reciprocal  of  7*,  the  energy  of  the  detection  signal  will  be  extremely  large  on  very 
short  intervals.  The  threshold  for  Example  4.1  appears  to  be  near  tj  =  lb,  so  that  in 


Chapter  4.  Examples  and  Analysis  of  Results 


96 


the  case  where  tj  =  I  the  energy  of  v  is  quite  significant. 

As  to  the  shape  of  v  on  different  intervals,  Figure  4.6  compares  v  for  the  different 
intervals  after  rescaling  in  both  magnitude  and  duration.  The  innermost  line  is  v  for 
tf  =  1,  continuing  outward  so  that  the  outermost  line  is  u  for  =  100.  It  is  clear 
that  V  is  not  the  same  function  on  different  intervals.  In  [29]  it  was  observed  that 
V  for  the  discrete  problem  begins  to  resemble  the  product  of  sine  functions  as  the 
interval  lengthens.  Figure  4.6  also  demonstrates  this  resemblance,  though  in  a  more 
approximate  sense  than  was  demonstrated  in  [29]. 


Figure  4.6:  Rescaled  v  for  Example  4.1:  tf  =  1, 10, 20, 100 

Figure  4.7  gives  the  y^{t)  and  ae(t)  for  various  values  of  e.  For  Example  4.1,  the 
noise  input  is  not  skewed  at  all  by  its  coefficient  matrix,  so  we  would  expect  the 
separating  hyperplane  to  be  insensitive  to  the  value  of  e.  Figure  4.7  clearly  shows 
that  y^{f)  and  a^it)  do  not  vary  significantly  across  the  full  spectrum  of  possible  values 
of  e  for  this  problem. 
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Figure  4.7:  y^{t)  and  ac{t)  for  Example  4.1:  e  =  0.3, 0.5,  0.7, 0.9 

As  mentioned  previously,  Example  4.1  is  coded  in  all  four  formulations  of  the 
FDMI  algorithm.  The  results  discussed  above  are  from  the  Version  One,  non-Riccati 
form  of  the  algorithm,  but  thorough  comparisons  have  shown  that  the  results  are 
equivalent  from  all  four  forms.  Table  4.2  summarizes  the  performance  of  all  four 
formulations:  Version  One,  non-Riccati  (InR)  and  Riccati  (IR),  and  Version  Two, 
non-Riccati  (2nR)  and  Riccati  (2R).  Each  was  run  on  the  intervals  [0, 1],  [0, 10],  and 
[0, 20]  for  a  total  of  twelve  runs.  Ten  of  the  runs  use  an  initial  mesh  size  of  50.  For 
tf  =  1,  the  2R  formulation  would  not  converge  using  this  initial  mesh  size,  as  was  also 
the  case  for  =  20  in  the  InR  formulation.  Both  converged  using  an  initial  mesh  size 
of  11,  so  that  value  was  substituted.  Those  entries  in  the  table  are  in  boldface  type. 
Non-convergence  of  the  SOCS  program  occurs  occasionally  and  is  usually  due  to  the 
combination  of  infeasible  initial  conditions  and  a  failure  to  resolve  system  dynamics 
on  a  specific  mesh.  In  addition,  a  few  of  the  problems  produced  early  program  termi¬ 
nations  due  to  the  magnitude  of  the  objective  function.  With  an  objective  function 
on  the  order  of  10“'*,  the  change  in  objective  function  values  between  iterations  is 
commonly  on  the  order  of  10"^.  The  default  SOCS  objective  function  error  tolerance 
is  10“®,  and  thus  productive  reductions  in  the  objective  function  may  be  lost  due  to 
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program  termination  before  they  occur.  To  counteract  this  effect,  the  objective  func¬ 
tion  for  those  problems  exhibiting  this  phenomenum  is  rescaled  by  10®.  The  scaling 
factor  was  removed  before  the  data  was  tabulated.  Later,  we  refer  to  this  rescaling 
of  the  objective  function  as  conditioning. 


Table  4.2:  Formulation  comparison  of  Example  4.1:  t/  =  1, 10, 20 


tf  - 

1 

10 

20 

Form 

CPU  Time 

Iterations 

CPU  Time 

Iterations 

CPU  Time 

Iterations 

InR 

61.55 

3 

96.16 

3 

67.04 

6 

2nR 

181.56 

4 

67.54 

3 

290.94 

4 

IR 

72.89 

4 

68.6 

3 

130.14 

4 

2R 

138.17 

6 

67.9 

3 

125.34 

4 

As  the  table  indicates,  results  were  not  consistently  better  for  any  formulation. 
While  formulation  InR  is  as  good  as,  or  better  than  the  others  in  four  of  the  six 
columns,  it  is  worse  by  50%  in  the  other  two  columns.  Since  there  is  no  clear  winner, 
and  since  it  is  also  not  our  intent  to  recommend  one  formulation  over  any  other, 
we  chose  to  run  the  remaining  models  in  formulations  InR  or  2nR.  These  formula¬ 
tions  provide  a  more  direct  correlation  with  the  original  problem  parameters  and  are 
simpler  to  encode.  In  addition,  neither  form  requires  Riccati  equations.  The  direct 
transcription  approach  taken  by  SOCS  converts  every  problem  into  a  boundary  value 
problem,  even  those  in  Riccati  form.  This  translation  removes  all  of  the  usual  ad¬ 
vantages  of  the  Riccati  approach.  Formulation  2nR  provides  excellent  compatibility 
between  the  minimum  energy  detection  signal  and  model  identification  parts  of  the 
algorithm,  while  some  rescaling  of  results  is  required  in  formulation  InR.  For  this 
reason,  our  higher  dimensional  examples  are  coded  only  in  formulation  2nR.  The 
formulation  used  is  transparent  to  the  results  presented  below. 
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4.4.2  Other  One-Dimensional  Examples 

The  remaining  one-dimensional  problems  are  similar  to  Example  4.1.  Plots  of  v 
for  each  problem,  as  well  as  comparisons  to  Example  4.1  follow  the  last  problem 
definition. 

Example  4.2.  [Change  of  eigenvalue,  stable-to -unstable]  This  is  also  a  simple 
problem  corresponding  to  a  change  in  a  system  parameter,  but  here  the  fault  model  is 
unstable. 


x'q 

=  -Xq  +  V-y 

(4.14a) 

V 

=  +  Pi 

(4.14b) 

x'l 

1 

=  2^1  +  +  /^4 

(4.14c) 

y 

=  Xi-irPz- 

(4.14d) 

Example  4.3.  [Severe  change  of  eigenvalue,  sta,ble-to-stableJ  This  problem  is 
similar  to  Example  4-1,  but  the  parameter  change  is  more  severe,  indicaiing  stiff 
dynamics. 


x'o  = 

-Xq  -hV  P2 

(4.15a) 

V  = 

Xo  +  Pi 

(4.15b) 

x'l  = 

— -f”  ■n  -|-  Pi 

(4.15c) 

y  = 

Xi  +  /i3- 

(4.15d) 

Example  4.4.  [Severe  change  of  eigenvalue,  stable-to-unstable]  This  problem  is 
similar  to  Example  4-2,  but  the  unstable  m, ode’s  parameter  change  is  severe,  ind.ica.ting 
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highly  unstable  dynamics. 


Xq 

=  -a:o  P2 

(4.16a) 

y 

=  X(i-\-  Pi 

(4.16b) 

x\ 

=  30a;i  +  v  +  Pi 

(4.16c) 

y 

=  X\-\-  pz- 

(4.16d) 

Figure  4.8  gives  v  for  Examples  4.2  -4.4  for  tj  -  1.  Table  4.3  compares  the 
separability  indices  and  the  energy  of  u  for  Examples  4. 1-4.4  on  the  intervals  [0,1] 
and  [0, 10].  As  the  plots  show,  the  shape  of  v  is  similar  for  all  of  the  one-dimensional 
problems  in  the  set.  Example  4.2  admits  a  lower  energy  minimum  proper  detection 
signal  than  Example  4.1  on  the  same  interval.  This  is  as  expected  since  the  two  models 
of  Example  4.2  have  more  contrasting  dynamics  than  in  Example  4.1.  Examples  4.3 
and  4.4  exhibit  even  sharper  changes  in  dynamics,  and  the  much  lower  energy  v  for 
each  of  those  problems  on  the  shorter  interval  bears  witness  to  that  fact. 

As  the  interval  lengthens,  however,  the  differences  narrow.  On  [0, 10],  Example  4.1 
still  requires  significantly  more  energy  in  v  than  the  other  examples.  The  unstable 
mode  of  Example  4.2  and  the  large  changes  in  dynamics  of  Examples  4. 3-4. 4  still 
allow  for  easier  separation  of  the  models  of  these  problems.  The  differences  between 
the  last  three  examples,  however,  are  much  smaller  than  on  the  [0, 1]  interval.  In  fact. 
Example  4.2  now  requires  less  energy  than  the  other  two  examples.  This  indicates 
the  expected  result:  problems  with  more  distinct  models  are  not  as  sensitive  to  the 
test  period  length  as  those  with  similar  models.  Obviously,  extremely  short  intervals 
will  present  difficulties  for  all  problems,  but  problems  with  distinct  models  will  be 
less  sensitive  on  those  intervals,  i.e.,  they  will  produce  a  larger  7*  and  will  thus  be 
easier  to  separate  than  problems  with  more  similar  models. 

Figure  4.9  shows  all  of  the  v  curves  on  the  same  plot,  each  rescaled  to  a  maximum 
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Figure  4.8:  v  for  Examples  4.2  (left),  4.3  (center),  4.4  (right);  ij  =  I 


Table  4.3:  7*  and  ||?;||  for  Examples  4. 1-4.4:  tf  =  1, 10 


tf  =  l 

tj==  10 

Example 

Y 

V 

Y 

V 

4.1 

4.2 

4.3 

4.4 

0.02075 

0.03261 

0.13418 

0.14344 

48.1919 

30.6690 

7.4524 

6.9717 

0.18719 

0.69849 

0.38687 

0.41355 

5.3423 

1.4317 

2.5848 

2.4181 
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height  of  one.  It  is  interesting  to  look  at  this  combined  v  plot.  With  =  1  the  u  of 
Examples  4.1  and  4.2  are  superimposed  on  each  other,  as  are  the  v  of  Examples  4.3 
and  4.4.  A  mathematical  examination  reveals  a  maximum  difference  of  about  0.003 
between  the  v  of  Examples  4.1  and  4.2.  On  a  longer  time  interval,  the  two  v  are 
still  very  similar  but  there  is  a  more  visible  difference.  The  fact  that  the  two  v  are 
so  similar  suggests  that  for  some  classes  of  problems  one  could  use  the  same  v  for  a 
number  of  different  fault  models,  changing  only  the  gain  to  ensure  that  it  is  proper. 


Figure  4.9:  v  for  Examples  4. 1-4.4:  tj  =  1 

Figure  4.10  gives  y^{t)  and  ae{t)  for  Example  4.4  for  various  values  of  e.  As  we 
saw  in  previous  plots  of  the  parameters  of  the  separating  hyperplane,  they  are  quite 
insensitive  to  the  value  of  e.  In  fact,  for  this  problem,  the  plots  for  the  different  values 
of  e  are  superimposed  on  each  other.  The  insensitivity  of  the  parameters  to  the  value 
of  €  is  encouraging  in  that  it  allows  some  assurance  of  computing  a  valid  test  function 
for  virtually  every  possible  output  from  competing  models  of  the  problem. 

It  should  be  noted  that  each  Oe(t)  computed  by  the  algorithm  has  been  checked 
against  those  functions  to  which  it  should  be  orthogonal.  That  is,  for  the  output 
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Figure  4.10:  y^{t)  and  a^(t)  for  Example  4.4:  e  =  0.3, 0.5, 0.7, 0.9 

given  by 

y  =  CiCi{B,v)  +  C,Ci{MiiM)  +  Cie^‘%  +  Nun  (4.17) 

the  normal  to  the  separating  hyperplane  should  be  orthogonal  to  The  inner 

products  of  the  computed  normal  and  were  within  10~®  of  zero  for  each  value 

of  e  in  every  example. 

4.5  Two-Dimensional  State  Examples 

We  continue  our  analysis  by  examining  seven  two-dimensional  examples.  Six  of  these 
examples  exhibit  properties  that  help  to  shed  light  on  the  shape  and  energy  of  the 
minimum  energy  detection  signal,  v,  as  well  as  the  shapes  of  the  normal  to  the 
separating  hyperplane,  ae(<),  the  midpoint  of  the  shortest  line  segment  between  the 
output  sets,  y^{t),  and  the  separability  index,  7*,  for  two-dimensional  problems.  The 
first  of  these  six  was  run  on  seven  different  interval  lengths  in  order  to  examine  the 
oscillatory  properties  of  v.  The  seventh  example  is  constructed  to  be  unobservable  in 
order  to  test  the  algorithm’s  capability  to  handle  such  problems. 
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4.5.1  Primary  Two-Dimensional  Example 

In  Example  4.5  we  find  a  problem  for  which  the  shape  of  v  varies  considerably  with 
the  interval  length. 

Example  4.5.  [Change  of  eigenvalues,  neutral  stability]  This  problem  has  purely 
imaginary  eigenvalues.  The  only  difference  between  models  is  a  change  in  eigenvalues 
from  ±3i  to  ±2i. 


x'q 

—  Zq  +  p,3 

(4.18a) 

4 

=  -9Xo  +  v  +  p,2 

(4.18b) 

y 

=  Xo  +  Pi 

(4.18c) 

x\ 

—  Zi  +  pg 

(4.18d) 

A 

=  -4xi  +  V  +  P5 

(4.18e) 

y 

—  Xi  Pi- 

(4.18f) 

Figure  4.11  shows  v  for  tf  =  1,20.  Figure  4.12  shows  v  for  tf  =  1,2,4,6,8,10,20. 
In  the  tf  =  1  case,  v  resembles  the  detection  signal  in  the  one-dimensional  cases 
studied,  but  at  longer  intervals  we  get  a  very  different  v.  Oscillations  are  introduced, 
and  the  number  and  period  of  the  oscillations  depend  on  the  interval  length.  It  should 
be  noted  that  the  plots  in  Figure  4.12  are  actually  7*F,  i.e.,  they  were  computed 
in  formulation  InR  but  not  rescaled  by  the  reciprocal  of  the  separability  index  as 
required  in  that  formulation  (see  Chapter  2).  Not  performing  the  rescaling  allows 
for  a  meaningful  comparison  of  the  various  shapes  of  v,  but  does  not  allow  for  a 
comparison  of  ||t;l|  between  intervals.  If  the  plots  were  rescaled  by  the  tf  —  1,2 
cases  would  dwarf  all  others  in  magnitude,  and  the  curve  shape  comparison  would  be 
impossible. 

Table  4.4  shows  Hull  and  other  quantities  of  interest  for  Example  4.5  on  each 
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interval  tested.  We  used  an  initial  mesh  size  of  50  for  all  but  the  tf  —  10  case. 
That  case  required  an  initial  mesh  of  20  in  order  to  attain  feasibility.  This  resulted 
in  more  iterations  before  convergence,  but  still  gave  CPU  time  comparable  to  the 
first  four  cases.  As  always,  CPU  times  should  only  be  used  as  rough  indicators  since 
they  often  vary  greatly.  As  an  example,  note  that  the  CPU  time  for  the  tf  =  8 
case  is  an  unexplainable  outlier.  For  the  tf  =  12,14,16,18  cases  we  used  default 
tolerances  which  are  looser  than  the  tolerances  used  in  all  other  cases.  This  was  done 
to  demonstrate  the  efficiency  of  the  software  under  default  conditions.  Note  the  CPU 
times  for  these  cases  are  considerably  lower  than  the  other  cases  and  the  number  of 
iterations  have  not  increased.  Aside  from  the  difference  in  tolerances  and  the  natural 
variation  often  present,  the  presence  of  inconsistent  CPU  times  may  indicate  the  need 
to  optimize  either  the  conditioning  on  the  objective  function  or  the  initial  mesh  size 
for  each  interval. 

Also  note  the  steady  decrease  of  jd  as  the  interval  lengthens.  This  effect  is  probably 
due  to  the  difference  in  dynamics  between  models  becoming  more  significant  as  the 
detection  interval  lengthens. 

Finally,  notice  that  7*  is  extremely  small  on  the  shorter  intervals.  As  the  interval 
lengthens,  however,  7*  grows  to  be  larger  than  that  for  the  one-dimensional  Example 
4.1.  Thus,  Example  4.5  requires  a  higher  energy  v  than  Example  4.1  on  the  shorter 
intervals,  but  a  lower  energy  v  on  longer  intervals.  Figure  4.13  depicts  the  relationship 
of  7*  to  the  interval  length  for  this  example,  and  a  comparison  to  7*  from  Example 
4.1. 

Figure  4.14  gives  y^t)  and  a^{t)  for  Example  4.5,  for  various  values  of  e,  and 
for  a  typical  interval  length,  tf  —  6.  Contrary  to  what  we  saw  in  previous  plots 
of  the  parameters  of  the  separating  hyperplane,  these  plots  depict  some  sensitivity 
to  the  value  of  e.  This  sensitivity  indicates  that  one  should  not  blindly  choose  a 
value  of  e  for  a  given  problem  without  first  determining  whether  the  value  of  e  is 
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Table  4.4:  Performance  comparison  of  Example  4.5  on  various  time  intervals 


if 

7* 

/5 

CPU  Time 

Iterations 

1 

0.72694*  10-^ 

0.500104 

850.94 

3 

2 

0.14710*  10"^ 

0.501659 

388.05 

3 

4 

0.91764*  10-1 

0.502983 

867.55 

5 

6 

0.18093 

0.463663 

960.15 

4 

8 

0.23532 

0.469403 

4580.01 

5 

10 

0.27321 

0.404642 

836.74 

9 

12 

0.29050 

0.351149 

264.08 

5 

14 

0.30296 

0.315157 

267.95 

5 

16 

0.31787 

0.293182 

259.66 

5 

18 

0.32954 

0.262228 

330.09 

5 

20 

0.33806 

0.240156 

9464.59 

6 

Figure  4.13:  7*  for  Example  4.5  as  a  function  of  tj  (left), 
compared  with  Example  4.1  (right) 
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a  factor.  For  this  problem,  and  most  likely  many  others,  the  value  of  e  affects  the 
accuracy  of  the  approximation  to  the  true  separating  hyperplane,  and  thus  should  be 
chosen  to  minimize  the  error  while  ensuring  a  positive  separation  between  output  sets. 
Fortunately,  tests  thus  far  indicate  that  the  calculations  are  reasonably  robust  and 
parameters  are  usually  not  highly  sensitive  to  the  value  of  e.  In  addition,  ae(t)  varies 
smoothly  with  e,  and  thus  one  can  guarantee  high  quality  results  by  experimenting 
with  e  values.  Ultimately,  if  perfect  model  identification  is  required,  one  should  apply 
Sv  with  5  >  1  and  use  e  =  1  in  the  MI  algorithm.  Additional  energy  in  u  is  a  small 
price  to  pay  to  guarantee  that  a(t)  is  accurate  to  within  machine  precision. 


Figure  4.14:  y^(t)  and  ae(t)  for  Example  4.5:  e  =  0.3, 0.5, 0.7, 0.9 


4.5.2  Other  Two-Dimensional  Examples 

The  remaining  two-dimensional  examples,  while  exhibiting  various  dynamical  and 
weighting  properties,  all  possess  similar  v  and  Oe(t)  qualities.  Thus,  they  will  be 
treated  together. 
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Example  4.6.  [Change  of  eigenvalues,  neutral  stability]  This  problem  has  the 
same  change  in  eigenvalues  from  ±3i  to  ±2z  as  Example  f.5.  It  also  has  different 
weighting  on  the  noise  matrix,  Mq. 


x'q 

—  ^0  +  5/t3 

(4.19a) 

^0 

=  -9xo  +  v  +  4//2 

(4.19b) 

y 

=  Xo  +  Pi 

(4.19c) 

x[ 

=  Zi  +  Pg 

(4.19d) 

=  -Axi  -\-v  -y  Pg 

(4.19e) 

y 

=  .Ti+//.^. 

(4.19f) 

Example  4.7.  [Change  of  eigenvalues,  neutra,l-to-unsta,ble[  This  problem  has 
eigenvalues  which  change  from  purely  imaginary,  ±3i,  to  complex  unstable,  ^  ±  3i. 


x'q 

—  >^0+^3 

(4.20a) 

-^0 

=  —9Xq  +  +  /i2 

(4.20b) 

y 

=  Xq  +  [J^i 

(4.20c) 

x\ 

=  Zi  +  fie 

(4.20d) 

A 

=  — Q.Olxi  +  +  fJ^e 

5 

(4.20e) 

y 

=  Xi  +  ^4* 

(4.20f) 
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Example  4.8.  [Change  of  eigenvalues,  neutral-to-stable]  This  problem  has  eigen¬ 
values  which  change  from  purely  imaginary,  ±3i,  to  complex  stable,  — ±  3i. 


x'q 

—  Zq  +  pz 

(4.21a) 

^0 

—  ~9xo  -\-  V  P2 

(4.21b) 

y 

=  Pi 

(4.21c) 

x'l 

=  Zi-\-  Pz 

(4.21d) 

1 

=  -9.01xi  - -zi+v  +  ps 

5 

(4.21e) 

y 

=  Xi  Pi. 

(4.21f) 

Example  4.9.  [No  change  of  eigenvalues,  neutral  stability]  This  problem  has  no 
change  in  dynamics.  Both  models  have  eigenvalues  of  ±3i.  The  difference  between 
models  is  a  change  in  weighting  of  the  noise  matrix,  Mi,  and  the  output  matrix,  Ci. 


Xq  —  Zq  4-  pz  (4.22a) 

z'q  =  -9xo  p2  (4.22b) 

y  ^  xo-f  Pi  (4.22c) 

x'l  =  Zi  +  2pe  (4.22d) 

=  — 9xi  +  ■y  +  3/^5  (4.22e) 

y  =  5xi  +  Pi.  (4.22f) 
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Example  4.10.  [Change  of  eigenvalues,  stable-to-stable]  This  problem  has  eigen¬ 
values  which  change  from  complex  stable,  —1  ±3i,  to  complex  stable,  —2  ±  2i. 


Xq 

—  -~^0  d-  Zo  +  Ps 

(4.23a) 

^0 

=  —9Xq  —  Zq  d-  V  P2 

(4.23b) 

y 

=  Xo  +  Pi 

(4.23c) 

x\ 

=  —2xi  Zi  Pa 

(4.23d) 

A 

=  -4.Ti  -  2zi  V  Pa 

(4.23e) 

y 

=  Xid-PA- 

(4.23f) 

On  the  left  side  of  Figure  4.15  is  v  on  the  interval  [0,1]  for  Examples  4.5-4.10. 
Some  scaling  problems  exist  among  the  examples  on  this  interval,  so  this  plot  is  left 
unrescaled  as  before.  Note  that,  for  the  first  time  thus  far,  we  have  some  oscillations 
in  V  on  our  shortest  interval.  This  indicates  that,  for  those  problems  which  exhibit 
the  oscillation,  [0, 1]  is  a  long  interval  relative  to  their  dynamics.  This  result  is  not 
surprising,  as  we  would  expect  the  dynamics  of  the  problem  to  influence  the  affect  of 
a  given  test  period  length  on  the  detection  signal. 

The  right  side  of  Figure  4.15  shows  the  same  information  on  the  interval  [0, 10]. 
Here,  the  scaling  problem  is  not  as  severe,  so  each  v  has  been  rescaled  by  Note 
that  one  problem,  Example  4.9,  exhibits  a  single-lobed  v,  in  contrast  to  all  other  two- 
dimensional  problems  so  far  on  this  interval.  Recall  that  this  example  has  no  change 
in  dynamics  between  models,  but  only  various  different  weight  matrices.  Thus  we 
should  expect  for  this  type  of  problem  that  v  merely  has  to  contain  enough  energy  to 
drive  the  output  sets  apart  without  requiring  any  complex  wave  form,  as  in  the  one¬ 
dimensional  problems  examined  previously.  Figure  4.16  gives  all  examples  exhibiting 
a  simple  v  for  the  case  tf  =  10.  These  plots  are  all  scaled  properly,  and  the  highest 
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energy  plot  is  that  of  Example  4.9. 

As  we  mentioned  in  the  previous  section,  [29]  observed  that  the  v  seem  to  lie  in 
a  region  bounded  by  half  of  a  period  of  the  sine  function.  Figure  4.17  shows  this 
by  plotting  the  same  information  as  Figure  4.15  (right),  with  each  curve  normalized 
in  energy.  It  is  quite  apparent  that  all  oscillations  of  the  various  v  do  occur  in  an 
envelope  that  resembles  the  sine  function. 


Figure  4.15:  v  for  Examples  4.5-4.10:  tf  =  1  (left),  tf  =  10  (right) 


Table  4.5  compares  various  quantities  from  Examples  4. 2-4. 4  and  4.5-4.10  for 
tf  =  I,  while  Table  4.6  does  the  same  for  tf  =  10.  Again,  an  initial  mesh  of  50 
was  used  for  all  examples,  except  Examples  4.2,  4.5,  4.9,  and  4.10  on  [0, 10]  used  an 
initial  grid  of  20  for  the  feasibility  reasons  mentioned  above.  It  is  clear  from  Table 
4.5  that  the  one-dimensional  examples,  along  with  the  two-dimensional  example  with 
unchanging  dynamics,  possess  higher  separability  indices  than  the  two-dimensional 
examples  with  changing  dynamics,  and  thus  admit  a  lower  energy  v.  This  difference 
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is  somewhat  clouded  as  the  interval  lengthens  to  10,  indicating  that,  on  longer  inter¬ 
vals,  the  energy  required  in  v  becomes  more  dependent  on  system  dynamics  than  on 
problem  dimension  or  simplicity. 


Table  4.5;  Performance  comparison  of  Examples  4.2-4.10:  tf  =  1 


Example 

7* 

CPU  Time 

Iterations 

4.2 

0.32606  *  10-1 

0.499823 

38.69 

2 

4.3 

0.13418 

0.509624 

310.90 

4 

4.4 

0.14344 

0.509624 

234.53 

4 

4.5 

0.72694  *  10-3 

0.500104 

850.94 

3 

4.6 

0.69719*  10-3 

0.520423 

829.42 

3 

4.7 

0.13485  *  10-3 

0.499998 

1535.48 

3 

4.8 

0.13485*  10-3 

0.499998 

533.84 

3 

4.9 

0.64642  *  10-1 

0.314308 

795.85 

3 

4.10 

0.12853*  10-2 

0.500149 

870.98 

3 

Table  4.6:  Performance  comparison  of  Examples  4.2-4.10:  tf  =  10 


Example 

7* 

/5 

CPU  Time 

Iterations 

4.2 

0.69849 

0.418685 

31.37 

4 

4.3 

0.38687 

0.579995 

838.09 

7 

4.4 

0.41355 

0.579995 

1281.74 

8 

4.5 

0.27321 

0.404642 

836.74 

9 

4.6 

0.17170 

0.529625 

8490.14 

5 

4.7 

0.29614*  10-1 

0.504179 

7165.95 

8 

4.8 

0.29614*  10-1 

0.504179 

2657.45 

5 

4.9 

0.14981 

0.335714 

2971.26 

6 

4.10 

0.38458  *  10-1 

0.514021 

469.30 

7 

Figure  4.18  gives  a  typical  y^{t)  and  ae{t)  for  these  examples. 
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Figure  4.18:  y^(t)  and  ac(t)  for  Example  4.10:  e  =  0.7 


4.5.3  Common  Mode  Two-Dimensional  Example 


For  our  last  two-dimensional  example  we  present  a  problem  in  which  the  two  models 
share  a  common  eigenvalue,  eigenvector  pair. 

Example  4.11.  [Shared  eigenvalue,  unstahle-to-unstablej  This  problem,  has  a 
change  in  its  first  eigenvalue  from  2  to  3.  The  other  eigenvalue  does  not  cha,nge 
between  models.  It  rema.ins  at  1. 


,  I"  1  0 1  [21 

Xq  =  Xo  +  V  + 

[  0  2  J  [12 

1  0  1  r  0  1  0  0 

3^0  + 

01  [1000 

,  f  1  0  1  f  2  1 

Xi  =  Xi+  u  -F 

I  0  3  J  [12 

1  0  1  r  0  1  0  0 

Xl  -|- 

0  1 J  [1000 


0  0  0  1 
0  0  10 


Mo 


Mo 


0  0  0  1 
0  0  10 


Ml 


Ml- 


(4.24a) 

(4.24b) 

(4.24c) 

(4.24d) 


Both  models  of  this  example  are  controllable  and  observable.  To  see  this  fact, 
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recall  from  Chapter  1  that  a  second-order  system  is  controllable  if  and  only  if  [s/  — 
A  I  B]  has  rank  2  for  all  values  of  s.  The  same  system  is  observable  if  and  only  if 
[sI-A^  I  C^]  has  rank  2  for  all  values  of  s  [6].  Clearly,  both  models  of  Example  4.11 
are  controllable  since  Bq  and  Bi  are  full  rank  by  themselves.  Both  models  are  also 
observable  since  the  same  holds  for  Cq  and  Cf .  However,  the  fact  that  there  exist  a;o 
and  Xi  subspaces  on  which  the  dynamics  and  the  outputs  are  the  same  is  equivalent 
to  saying  that  the  combined  system  is  not  observable.  These  subspaces  exist  because 


is  a  solution  of  the  free  response  for  both  of  them.  Thus  each  output  set  has  a  parallel 
side  in  the  long  direction. 

In  terms  of  the  FDMI  algorithm,  as  it  attempts  to  separate  the  output  sets  in  the 
detection  signal  phase,  the  last  points  to  touch  will  be  those  on  the  parallel  side.  Many 
states  and  outputs  will  be  eligible  to  be  part  of  the  optimal  solution,  and  because  of 
this  nonuniqueness,  the  algorithm  may  be  unable  to  choose  one.  Simply  stated,  the 
two  models  may  not  be  different  enough  for  the  algorithm  to  find  a  unique  minimal  v. 
Even  if  it  is  possible  to  choose  an  optimal  solution  from  the  equivalent  candidates  in 
the  detection  signal  phase,  the  model  identification  phase  also  separates  the  output 
sets.  As  the  algorithm  searches  for  the  closest  points  on  the  closure  of  each  set,  it  will 
again  find  many  eligible  points  on  each  parallel  side,  and  may  be  unable  to  choose. 

In  fact,  this  problem  does  occur  in  Example  4.11.  As  mesh  refinements  are  made 
and  the  dynamics  are  resolved  to  near-tolerance  accuracy,  the  algorithm  terminates 
with  a  warning  about  degenerate  constraints  and  SQP  errors.  (When  the  problem  is 
modified  to  remove  the  parallel  sides,  all  errors  disappear  and  the  problem  converges 
to  any  desired  tolerance.)  Despite  the  errors,  the  dynamics  are  resolved  quite  accu¬ 
rately,  down  to  the  10“®  level.  In  addition,  the  algorithm  is  converging  to  a  solution 
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before  the  errors  occur.  So  it  appears  that  the  algorithm  is  able  to  handle  this  type 
of  problem  despite  the  lack  of  uniqueness  of  the  optimal  solution.  Figures  4.19  and 
4.20  give  V,  y^t),  and  ai{t)  for  this  problem  on  the  interval  [0, 5]. 


If  the  FDMI  algorithm  is  not  able  to  solve  a  given  problem  with  common  eigen¬ 
value,  eigenvector  pairs  between  models,  we  have  the  option  of  truncating  the  outputs. 
That  is,  we  multiply  outputs  from  both  models  by  the  same  matrix,  annihilating  the 
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identical  part,  and  solve  the  resulting  lower  dimensional  problem  as  before.  Testing 
of  this  procedure  will  be  left  to  future  research. 


4.6  Industrial  Example 

We  now  turn  to  a  real  world  example  in  order  to  test  the  performance  of  the  FDMI 
algorithm  on  larger  scale  problems.  The  problem  is  the  equalized  and  linearized  model 
of  a  single-engine  F-16  aircraft  [39].  The  model  has  a  three-dimensional  state,  and 
the  dynamics  include  a  reference  control  input. 

Example  4.12.  [Change  of  eigenvalues,  stable-to-unstable]  The  nominal  model 
of  this  problem  has  one  stable  real  eigenvalue,  and  two  stable  complex  eigenvalues. 
The  fault  model  has  three  unstable  real  eigenvalues. 
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0  0.9971  0.0755 
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(4.26b) 


A  fault  model  simulating  an  electrical  interruption  to  a  flight  control  computer’s  input 
channels  may  be  represented  as 
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The  three  states  are  side-slip,  roll  rate,  and  yaiu  rate,  and  the  control  vector  is 


(rudder  input 
stick  input 


Assuming  that  the  detection  signal  is  applied  on  the  same  channels  as  the  control 
vector  during  the  test  period,  one  of  the  following  occurs: 

•  the  control  is  nulled  so  that  only  the  detection  signal  is  inputted, 

•  the  control  remains.  It  is  subtracted  from  v  from  the  MEDS  algorithm 
to  obtain  the  additional  signal  required  to  separate  the  output  sets  of  the 
two  models. 

The  second  option  is  the  equivalent  of  solving  one  of  the  alternative  problems  de¬ 
scribed  in  Chapter  2,  i.e., 

min||u||  subject  to  jnax^  J„,+„(/3)  >  1.  (4-27) 

If  the  detection  signal  must  be  kept  off  of  the  control  channels,  then  the  other  problem 
described  in  Chapter  2  is  solved,  i.e., 

min||u;i||  subject  to  max  J,c{(d)  >  1  and  W2  =  u.  (4.28) 

0</3<l 

where  the  dynamics  are  described  by 

x\  =  AiXi  -f  BiV  -f  EiU  +  MiPi-  (4.29) 

and  where  w  =  [wi,W2]  =  [v,v].  Obviously,  if  the  nulling  option  is  chosen,  the  test 
period  should  be  kept  as  short  as  possible  to  avoid  aircraft  control  difficulties.  The 
results  below  reflect  the  nulling  option. 

SOCS  has  no  difficulty  solving  this  problem.  The  FORTRAN  code  generated  is 
lengthy,  but  optimized  coding  methods  can  minimize  the  impact  of  this  effect.  In 
fact,  standard  routines  for  translating  matrix  multiplication  into  FORTRAN  loops 
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can  reduce  the  effect  of  the  added  dimensionality  to  virtually  zero.  Figures  4.21  and 
4.22  give  v,  and  a^{t)  for  this  problem  on  the  interval  [0, 1]. 


Figure  4.21:  Components  of  n  for  Example  4.12:  tf  =  1 


Figure  4.22:  Components  of  y^{t)  and  Oe(t)  for  Example  4.12:  e  =  0.7 

These  results  demonstrate  that  the  FDMI  algorithm  is  capable  of  handling  higher 
dimensional  problems  and  is  limited  only  by  the  multi-model  assumptions  inherent 
in  the  approach. 
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4.7  Multiple  Fault  Model  Examples 

To  conclude  the  analysis  of  examples,  we  examine  two  systems,  each  of  which  has  a 
nominal  model  and  two  fault  models.  One  of  these  systems  is  one-dimensional  and 
the  other  is  two-dimensional. 

Recall  from  Chapter  2  the  two  approaches  described  to  handle  multiple  fault 
model  problems.  The  first  method  is  sequential  in  nature,  in  which  the  test  period  is 
divided  into  equal  segments,  and  problems  involving  each  pair  of  models  are  solved  on 
the  shorter  intervals.  It  was  suggested  that  this  method  would  produce  a  v  that  was 
larger  than  necessary.  The  second  method  should  produce  a  u  of  much  lower  energy. 
It  is  simultaneous  in  nature,  in  which  the  pairwise  problems  are  solved  independently 
on  the  full  interval  for  a  common  v. 

Our  last  two  examples  demonstrate  the  ability  of  the  FDMI  algorithm  to  handle 
multiple  fault  model  systems  as  well  as  shed  light  on  the  conjecture  of  Chapter  2 
about  the  energy  of  the  v  resulting  from  each  method. 

4.7.1  One-Dimensional  Example 

The  simplest  multiple  fault  model  problem  is  given  in  Example  4.13. 

Example  4.13.  [Change  of  eigenvalue,  stable/ sta, hie/ stable]  This  problem  has  a 
stable  nominal  model  and  two  stable  fa,ult  m,odels.  Eigenvalues  are  —1,  —3,  and  —0.2, 
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respectively. 


model  0: 

x'q  =  -Xo  +  V  +  P2 

(4.30a) 

y  =  Xo  +  p-i 

(4.30b) 

model  1: 

x[  =  — 3x1  +  V  +  Pi 

(4.30c) 

y  =  xi  +  p3 

(4.30d) 

model  2: 

X2  =  — 0.2x2  V  Pq 

(4.30e) 

y  =  X2+P5- 

(4.30f) 

Figure  4.23  gives  v  for  the  two  methods.  On  the  left  is  the  comparison  of  v  using 
the  sequential  method  on  [0,1],  [1,2],  and  [2,3]  to  that  of  the  simultaneous  method 
on  [0,3].  The  plots  are  in  the  same  scale,  and  it  is  obvious  that  the  simultaneous 
method  produces  a  u  of  much  lower  energy.  On  the  right  side  of  Figure  4.23  is  a 
comparison  of  v  from  the  simultaneous  method  to  each  v  obtained  by  solving  the 
pairwise  problems  completely  independently  on  the  entire  interval  [0,3].  For  this 
problem,  the  simultaneous  v  is  the  same  as  the  highest  energy  v  from  the  independent 
pairwise  problems.  Thus,  we  can  handle  multiple  fault  models  for  the  same  energy 
and  with  the  same  v  wave  form  as  required  for  one  fault  model.  Often,  however,  the 
simultaneous  v  will  require  more  energy  and  will  not  be  the  same  shape  as  the  any 
of  the  three  independent  v.  However,  the  similar  shapes  of  the  v  for  this  problem 
suggest  that  if  feasibility  issues  arise  in  the  simultaneous  method,  the  v  from  the 
pairwise  problems,  or  multiples  thereof,  can  be  applied  as  an  accurate  initial  guess 
for  the  combined  problem. 

Figure  4.24  gives  the  y^{t)  and  ae(t)  for  the  sequential  method,  and  Figure  4.25 
gives  the  same  for  the  simultaneous  method.  The  only  immediate  advantages  which 
can  be  gleaned  for  the  simultaneous  method  from  these  plots  is  the  continuity  of  y  and 
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a  over  the  entire  interval,  and  the  lack  of  a  requirement  to  maintain  three  different  test 
functions.  Since  y  and  a  must  be  stored,  if  a  number  of  comparisons  between  multi¬ 
dimensional  models  must  be  made,  the  reduced  number  of  test  functions  demonstrates 
better  use  of  possibly  limited  memory  resources. 


Figure  4.23:  v  for  Example  4.13  sequential  vs.  simultaneous  (left), 
full  interval  two-model  vs.  simultaneous  (right) 


Figure  4.24:  y^{t)  and  ae{t)  for  Example  4.13:  sequential  solve 
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Figure  4.25:  y^it)  and  ae(t)  for  Example  4.13:  simultaneous  solve 


4.7.2  Two-Dimensional  Example 

Our  final  example  exhibits  several  interesting  stability  properties. 

Example  4.14.  [Change  of  eigenvalues,  stable/stable/unstable]  This  problem  has 
a  stable  nominal  model,  one  stable  fault  model,  and  one  unstable  fault  model.  Nominal 
model  eigenvalues  are  —2  ±  i.  Fault  model  1  eigenvalues  are  —9.7016  and  —3.2984. 
Fault  model  2  eigenvalues  are  |  ±  1.3229i. 
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Figure  4.26  gives  v  for  the  two  methods.  On  the  left  is  the  comparison  of  v 
using  the  sequential  method  to  that  of  the  simultaneous  method.  The  plot  oscillating 
minimally  around  the  horizontal  axis  is  that  of  the  simultaneous  method.  Again  it 
is  obvious  that  the  simultaneous  method  produces  a  F  of  much  lower  energy.  On  the 
right  side  of  Figure  4.23  is  a  comparison  of  v  from  the  simultaneous  method  (dotted 
line)  to  the  v  obtained  by  solving  the  pairwise  problems  completely  independently 
on  the  entire  interval  (solid  lines).  The  simultaneous  v  is  not  the  same  as  any  other 
V  from  the  independent  pairwise  problems.  In  fact,  the  simultaneous  v  contains  an 
extra  extrema  on  the  interval.  Even  so,  the  plot  shows  that  we  can  handle  multiple 
fault  models  with  only  a  bit  more  energy  than  required  for  a  single  fault  model. 

Figure  4.27  gives  the  y^{f)  and  af(t)  for  the  sequential  method,  and  Figure  4.28 
gives  the  same  for  the  simultaneous  method.  Again,  the  only  advantages  appear  to 
be  the  continuity  of  y  and  a,  and  simplicity  of  the  test  function  for  the  simultaneous 
method. 

4.8  Conclusion 

The  purpose  of  this  chapter  was  to  give  examples  and  analysis  of  the  practical  aspects 
of  the  FDMI  algorithm.  After  restating  the  problem  and  one  form  of  the  algorithm, 
we  examined  the  simplest  problems  in  one  dimension  to  ascertain  shapes  and  ener¬ 
gies  of  the  minimum  energy  detection  signals  for  those  problems.  Comparisons  of  one 
problem  in  four  different  formulations  using  different  initial  guesses  showed  that  each 
arrived  at  the  same  optimal  solution,  supporting  uniqueness  assertions  made  in  Chap¬ 
ter  3.  We  then  looked  at  more  complex  examples  in  two  dimensions  to  demonstrate 
how  the  basic  shapes  and  energies  of  v  evolve.  An  unobservable  system  was  tested, 
and  despite  a  known  lack  of  uniqueness  of  the  closest  points  in  the  output  sets,  the 
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Figure  4.26:  v  for  Example  4.14  sequential  vs.  simultaneous  (left), 
full  interval  two-model  vs.  simultaneous  (right) 


Figure  4.27:  yg(t)  and  ag(t)  for  Example  4.14:  sequential 
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Figure  4.28:  y^(t)  and  a((t)  for  Example  4.14:  simultaneous 


FDMI  algorithm  and  SOCS  provided  a  solution.  An  industrial  strength  example  in 
three  dimensions  proved  that  the  algorithm  and  SOCS  were  capable  and  efficient  even 
in  the  higher  dimensions.  We  ended  with  examples  of  the  extension  of  the  algorithm 
to  problems  with  multiple  fault  models. 

Throughout  the  examples  we  saw  varjdng  degrees  of  sensitivity  to  the  value  of  e. 
It  was  demonstrated  that  some  care  must  be  taken  in  choosing  the  value  of  e  used  in 
the  algorithm.  However,  it  was  also  shown  that  high  quality  results  can  be  reliably 
computed  with  the  proper  selection  of  e.  In  addition,  we  saw  some  difficulties  arising 
due  to  very  small  objective  function  values.  A  method  of  automatic  conditioning 
on  objective  functions  should  be  developed  to  overcome  these  difficulties.  Finally, 
systems  for  which  a  feasible  initial  guess  is  difficult  to  obtain  due  to  initial  mesh 
size  considerations  may  require  pre-conditioning  to  facilitate  making  the  initial  guess. 
This  problem  may  also  arise  due  to  the  fact  that  —v  and  v  are  both  optimal  solutions. 
An  initial  guess  of  u  =  0  may  be  suitable  for  some  problems,  but  it  is  a.  saddle  point. 
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and  using  it  could  slow  down  convergence  of  the  SOCS  program.  Despite  these  issues, 
the  analysis  in  this  chapter  demonstrates  the  efficiency  and  practicality  of  the  FDMI 
algorithm. 


Chapter  5 


Future  Work  and  Conclusions 

5.1  Future  Work 

Up  to  this  point,  we  have  exercised  the  FDMI  algorithm  mostlj^  within  the  confines  of 
certain  assumptions  about  the  structure  of  the  problem  to  be  solved.  For  example,  we 
have  assumed  a  short  test  period  length,  no  a  priori  knowledge  of  initial  conditions  on 
the  state,  and  strict  linearity  in  the  dynamics.  While  we  have  extended  the  algorithm 
to  problems  with  multiple  fault  models,  with  a  pre-existing  known  control,  and  with 
alternative  cost  functions,  we  have  not  mentioned  other  variations  that  may  help  to 
extend  the  algorithm  to  an  even  larger  set  of  problems.  Some  variations  we  have 
mentioned,  but  have  left  to  future  research.  The  unreduced  model,  mentioned  in 
Chapter  2,  would  be  of  direct  benefit  to  several  types  of  problems  excluded  from  our 
study.  Also,  unobservable  systems,  theoretically  excluded  from  the  FDMI  algorithm, 
may  in  fact  be  treatable.  While  we  discussed  the  possibility  of  applying  the  FDMI 
algorithm  to  such  systems,  and  even  worked  an  example  for  which  the  algorithm 
provided  a  solution,  the  bulk  of  the  work  remains.  In  particular,  the  most  promising 
aspect,  that  of  projecting  each  model  onto  a  subspace  to  eliminate  the  parallel  sides, 
is  left  undone. 
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This  chapter  will  introduce  and  discuss  several  other  interesting  and  applicable 
variations  to  the  basic  problem  and  the  impact  they  have  on  the  FDMI  algorithm. 
Topics  mentioned  below  are:  the  half-infinite  interval,  linear  time  varying  problems, 
nonlinear  problems,  independent  noise  channels,  and  sensitivity  issues.  We  mention 
these  topics  for  the  purpose  of  highlighting  the  work  still  remaining  in  this  area,  and 
thus  will  not  attempt  a  complete  and  detailed  look  at  each  one.  Conclusions  follow 
the  last  topic  and  complete  our  discussion. 

5.1.1  The  Half-Infinite  Interval 

We  mentioned  in  Chapter  1  that  Nikoukhah  et  al.  [29]  was  the  inspiration  for  the 
work  contained  in  this  thesis.  In  that  paper,  theory  is  developed  for  the  limiting 
shape  and  energy  of  v  on  the  interval  [0,  oo).  While  in  practice  the  use  of  the  half¬ 
infinite  detection  horizon  is  not  always  possible  due  to  unstable  fault  models  or  cost 
considerations,  theoretical  development  of  the  limiting  case  can  aid  in  approximating 
detection  signals  and  separability  indices  on  long  intervals.  As  this  thesis  is  dedicated 
more  to  practical  applications  of  online  detection,  we  leave  the  development  of  the 
limiting  case  to  future  research. 

To  aid  in  that  research,  we  note  that  Riccati  matrix  differential  equations  provide 
convenient  properties  on,  and  useful  insights  into  the  half-infinite  interval.  Finite  in¬ 
terval  Riccati  differential  equations  become  algebraic  equations  as  tf  goes  to  infinity. 
In  fact,  a  large  part  of  the  Riccati  form  of  the  FDMI  algorithm  should  be  directly 
extendable  to  the  half-infinite  interval.  The  translation  of  other  parts  and  confirma¬ 
tion  of  the  results  remain  to  be  accomplished.  In  particular,  questions  remain  about 
the  multiple  fault  model  case,  and  whether  the  FDMI  algorithm  will  be  capable  of 
addressing  those  problems  on  the  half-infinite  interval. 
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5.1.2  Linear  Time  Varying  Models 

The  linear  time  invariant  problem  addressed  thus  far  is  a  special  case  of  a  larger  set  of 
control  problems,  the  linear  time  varying  (LTV)  problem.  In  the  LTV  case,  linearity 
still  exists,  but  the  coefficient  matrices  A,  B,  C,  M,  and  N  depend  on  time.  Thus 
the  multi-model  system  model  4.1  becomes 

x'i  =  Ai{t)xi  + Bi{t)v  +  Mi{t)p.i  (5.1a) 

y  =  Ci{t)xi  +  Ni{t)pi  (5.1b) 

for  i  =  0,...,m.  Luckily,  convexity  still  exists  due  to  linearity,  so  the  same  visu¬ 
alizations  and  characterizations  of  system  dynamics  and  outputs  can  be  made.  In 
fact,  even  much  of  the  Riccati  theory  mentioned  in  the  previous  section  holds.  Un¬ 
fortunately,  the  infinite  interval  algebraic  Riccati  equation  does  not  hold,  so  further 
research  not  based  on  our  Riccati  approach  will  be  required  for  that  case. 

We  should  also  note  that  the  system  reductions  described  in  Chapter  2  will  now 
involve  time  varying  matrices,  so  software  must  be  chosen  which  can  handle  the  new 
complexities  involved.  In  addition,  technical  difficulties  arise  due  to  time-varying 
coordinate  changes,  x  =  Q{t)w.  Q'  will  enter  the  equations,  and  differentiation  of 
computed  quantities  is  highly  undesirable.  Also,  even  if  Ni  is  full  row  rank  there  still 
may  not  exist  a  submatrix  which  is  invertible  for  all  t.  These  difficulties  highlight 
the  need  to  develop  a  theory  and  algorithms  that  work  on  systems  in  their  original, 
unreduced  form.  It  is  apparent  that  much  adaptation  of  theory,  further  development 
of  algorithms,  and  tests  of  new  examples  are  required  before  the  LTV  subject  can  be 
considered  closed. 
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5.1.3  Nonlinear  Models 

Another  larger  set  of  control  problems  of  which  the  LTI  problem  is  a  special  case  is 
the  nonlinear  problem.  The  semi-explicit  nonlinear  control  problem  can  be  written 
as 


x'  =  f[x{t),u{t),t]  (5.2a) 

y  =  9[x{t),u{t),t].  (5.2b) 

While  the  FDMI  algorithm  should,  in  principle,  be  extendable  to  these  types  of 
systems,  technical  and  computational  difficulties  arise  due  to  the  loss  of  convexity. 
As  a  result,  solutions  are  only  locally  optimal,  if  they  can  be  shown  to  exist  at  all, 
and  the  implementation  of  the  algorithm  in  optimization  codes  becomes  subject  to 
significant  initial  guess  and  convergence  issues. 

Of  the  general  class  of  nonlinear  control  problems,  three  types  show  promise  for 
more  immediate  application  of  the  FDMI  algorithm.  These  are:  small  bounded  non- 
linearities,  nonlinearities  in  only  the  control,  and  nonlinearities  involving  coefficient 
matrices  dependent  on  v.  We  discuss  each  of  them  in  turn. 

Small  Nonlinearities 

Small  norm-bounded  nonlinearities  do  not  present  undue  difficulties  to  the  FDMI 
algorithm.  Suppose  that  the  models  and  noise  bounds  are  of  the  form 

x\  =  AiXi  +  gi{xi,  t)  -h  BiV  -h  MiPi 
y  =  CiXi  -t-  NiPi 

\\f£  <  1- 


(5.3a) 

(5.3b) 

(5.3c) 


Chapter  5.  Future  Work  and  Conclusions 


133 


If  t)||  <  e,  then  we  can  address  (5.3)  by  considering 

x\  =  AiXi  +  BiV  +  M  iji.i 
y  =  CiXi  +  NiFi 

\\Fif  < 


(5.4a) 

(5.4b) 

(5.4c) 


where  p,  = 


^],Mi  =  [I  Mi],  and  Ni  =  [0  Ni],  After  rescaling,  (5.4)  is  in  the  form 


required  by  the  FDMI  algorithm.  This  formulation  will  produce  an  answer  which  over 


estimates  the  required  ||u||.  Many  robust  algorithms  can  handle  small  nonlinearities 
in  this  way.  Tests  of  the  efficiency  of  the  v  produced  via  this  formulation  are  left  to 


future  research. 


Nonlinear  in  the  Control 

Another  way  nonlinearities  may  enter  the  problem  is  through  the  control.  Recall  from 
Chapters  2  and  3  that  [30]  begins  with  our  problem  and  assumptions,  but  goes  in 
quite  a  different  direction  to  develop  the  detection  signal  and  separating  hyperplane. 
While  that  approach  seems  more  direct  and  satisfying,  it  is  not  capable  of  handling 
nonlinear  controls.  The  FDMI  algorithm  gains  a  distinct  advantage  in  this  aspect 
because  it  can  solve  problems  with  nonlinearities  in  the  control. 

Suppose  that  the  nonlinearly  controlled  models  are 

x'i  =  AiXi  +  Big{v)  +  MiPi,  (5.5a) 

y  -  CiXi  +  NiPi  (5.5b) 

where  v  may  be  a  steering  angle,  for  example,  and  thus  would  enter  the  equations  as 
cos  V.  The  2nR  form  of  the  optimization  problem  is  now 

min  ||r;||  such  that  Jg[y){P)  >  1  for  some  P  G  [0, 1]  (5.6) 

V 
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which,  after  the  reductions  described  in  Chapter  2,  becomes  a  nonlinear  boundary 
value  problem.  While  SOCS  has  excellent  nonlinear  capabilities,  the  nature  of  g{v) 
will  have  a  lot  to  say  about  the  solution  to  the  BVP.  If  the  function  is  not  one  to  one, 
as  cosu  is  not,  then  we  should  expect  multiple  local  minima  and  saddle  points.  The 
rate  of  convergence  to,  and  the  quality  of  the  solution  then  become  a  function  of  the 
initial  guess.  Development  of  an  efficient  initial  guess  algorithm  becomes  paramount, 
and  is  a  ripe  topic  for  future  research. 

Coefficient  Matrices  Dependent  on  the  Detection  Signal 

An  interesting  class  of  nonlinear  problems  involves  system  coefficient  matrices  which 
can  be  affected  by  the  detection  signal.  The  system  models  for  this  class  are 

x[  =  Ai{v)xi  + Bi{v)v  +  (5.7a) 

y  =  Ci{v)xi  +  Ni{v)pi.  (5.7b) 

For  this  problem,  for  a  given  v  the  output  sets,  A^{v),  are  still  convex.  However, 
they  vary  nonlinearly  with  v.  Given  that  one  has  a  proper  detection  signal,  the 

FDMI  algorithm  can  still  be  used  to  solve  for  the  normal  to  the  separating  hyperplane. 

Obtaining  the  minimum  energy  proper  detection  signal  may  not  be  as  straightforward. 
Since  the  minimization  of  |lu|l  occurs  on  the  outside  of  all  other  operations,  the 
algorithm  may  work  as  coded  in  SOCS.  On  the  other  hand,  this  type  of  problem  is 
similar  to  the  LTV  problem,  but  with  an  added  nonlinear  structure.  The  difficulties 
described  for  that  problem  are  applicable  here  as  well.  Reductions  involving  nonlinear 
u-varying  matrices  will  undoubtedly  be  more  complex  than  those  involving  linear 
time  varying  matrices,  and  will  require  algorithms  and  software  that  can  handle  their 
complexities.  In  addition,  if  a  u-varying  coordinate  change  is  required,  then  v'  may 
enter  the  equations.  As  in  the  LTV  case,  one  should  avoid  differentiating  a  computed 
quantity.  Each  of  these  issues  provides  a  strong  argument  for  developing  algorithms 
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based  on  the  unreduced  problem.  As  in  the  LTV  problem,  adapting  the  algorithm, 
confirming  the  theory,  and  testing  examples  remains  as  future  research. 

5.1.4  Independent  Noise  Bounds 

Systems  may  have  an  added  complexity  in  that  noise  channels  are  independently 
bounded.  Thus,  the  expression  for  the  bound  on  the  noise  changes  from 

llrf<l,  i  =  0,l  (5.8) 

to 

ll/iyll^  <  1,  i  =  0,l,  j  =  l,...ni  (5.9) 

where  j  is  the  channel  and  Ui  is  the  number  of  noise  channels  in  model  i.  This  addi¬ 
tional  complexity  has  a  far  reaching  impact  on  the  problem.  First,  (5.9)  is  equivalent 
to  \\pi\\lo  <  1)  \\pi\\cx>  is  not  a  strictly  convex  norm.  Also,  the  construction  of 

the  auxiliary  cost  function,  J„(/5),  must  now  accomodate  no  +  ni  terms.  Finally,  the 
translation  of  the  cost  function  into  those  terms  necessary  for  the  application  of  op¬ 
timal  control  theory  may  or  may  not  be  possible.  Much  work  remains  undone  in  this 
area,  including  the  possible  requirement  of  all  new  theory. 

5.1.5  Sensitivity  Issues 

While  we  have  talked  about  the  sensitivity  of  the  separating  hyperplane  to  the  value 
of  e,  we  have  ignored  other  sensitivity  issues.  One  is  often  interested  in  the  sensitivity 
of  the  solution  to  perturbations  of  the  system  coefficient  matrices  A,,  Bi,  and  so  on. 
If  coefficients  are  slightly  inaccurate,  then  we  would  like  to  say  that  the  test  function 
from  the  FDMI  algorithm  is  still  usable,  but  subject  to  some  error.  A  sensitivity 
study  identifies  bounds  on  the  error  due  to  coefficient  perturbations.  It  should  be 
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noted  that,  as  in  the  small  nonlinearities  case  discussed  above,  the  algorithm  can  be 
modified  to  be  robust  to  bounded  perturbations  of  the  system  coefficients  via  the 
addition  of  more  noise  terms  resulting  in  a  higher  energy  than  necessary  detection 
signal.  In  that  light,  the  development  of  some  results  in  the  sensitivity  issue  will 
closely  follow  those  of  the  small  nonlinearities  problem.  As  an  alternative,  one  could 
consider  system  matrices  Ai  +  AAj,  Bi  +  ABj,  and  so  on,  where  AAi  and  ABi  are 
small.  System  models  for  this  approach  are  straightforward  to  write  down,  but  will 
involve  many  more  parameters  than  the  unperturbed  models.  The  optimal  solution 
will  depend  on  the  perturbations,  providing  the  sensitivity  information  one  seeks,  but 
solving  the  problem  will  be  computationally  expensive.  Regardless  of  the  approach 
taken,  much  more  work  on  sensitivity  issues  remains. 

5.2  Conclusions 

Our  goals  for  this  thesis  were  to  apply  the  multi-model  approach  to  fault  detection  and 
model  identification  in  linear  descriptor  systems,  modeling  noise  as  bounded  energy 
signals,  proving  that  this  combination  is  a  valid  and  efficient  tool  for  these  types  of 
problems,  and  to  develop  an  algorithm  that  demonstrates  perfect  fault  detection  and 
model  identification.  As  we  saw  in  Chapter  1,  the  combination  of  the  multi-model 
approach  and  the  bounded  energy  noise  model  is  under- explored,  in  that  very  little 
existing  literature  reports  their  combined  use.  Most  work  in  the  multi-model  approach 
uses  statistical  noise  models  which,  while  theoretically  attractive,  do  not  provide  the 
computational  friendliness  to  optimization  software  that  the  bounded  energy  noise 
models  do.  The  remaining  theory  involves  the  single-model  approach,  utilizing  either 
feedback  or  observer  design  for  imperfect  fault  detection  and  model  identification. 

After  laying  the  groundwork  for  the  types  of  systems  to  be  addressed  and  the 
numerical  methods  to  be  applied  to  those  systems  in  Chapter  1,  we  developed  the 
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theory  for  the  first  half  of  the  problem,  fault  detection,  in  Chapter  2.  There,  after 
defining  the  minimum  energy  detection  signal,  we  constructed  the  auxiliary  cost  func¬ 
tion.  We  then  translated  the  problem  into  a  nested  optimization  problem  suitable  for 
implementation  in  SOCS,  solving  for  the  necessary  conditions  for  a  minimum  of  the 
inner  problem,  and  then  using  those  conditions  as  constraints  on  the  outer  problem. 
We  presented  four  forms  of  the  MEDS  algorithm,  two  of  which  apply  matrix  Riccati 
differential  equation  theory,  which  is  well-suited  to  the  limiting  half-infinite  interval 
case.  After  stating  the  algorithm,  and  the  software  in  which  we  implemented  it,  we 
addressed  several  variations  of  the  problem  for  which  the  algorithm  is  well  suited, 
most  notably  the  multiple  fault  model  case,  with  which  prior  work  in  this  area  is  not 
compatible. 

Next  we  turned  to  the  second  half  of  the  problem,  model  identification.  Given  an 
output,  our  detection  signal  ensured  that  only  one  model  could  have  produced  it.  In 
Chapter  3  we  developed  the  theory  for  the  separating  hyperplanc  for  the  output  sets. 
We  reduced  the  input  of  the  noise  to  the  system  in  order  to  translate  the  problem  into 
an  optimization  problem,  the  solution  of  which  gave  an  approximation  to  the  normal 
of  the  separating  hyperplane.  The  equation  of  the  normal  was  implemented  as  a 
test  function  in  the  MI  algorithm.  After  stating  the  algorithm,  we  again  addressed 
variations  of  the  problem,  and  showed  that  the  second  half  of  the  algorithm  is  as 
equally  suited  to  those  variations  as  the  first  half. 

Chapter  4  was  dedicated  to  the  presentation  and  analysis  of  various  types  of  ex¬ 
amples.  We  applied  our  implementation  of  the  FDMI  algorithm  to  one-,  two-,  and 
three-dimensional  examples,  demonstrating  comparable  performance  on  all.  A  mul¬ 
tiple  fault  model  case  was  examined,  and  the  capability  of  the  algorithm  in  this  area 
was  shown  to  be  as  expected.  In  fact,  the  only  unexpected  result  came  from  an  exam¬ 
ple  that  was  constructed  to  force  the  SOCS  version  of  the  algorithm  to  fail.  Instead 
of  failing,  the  algorithm  worked,  demonstrating  unanticipated  robustness.  Despite 
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this  success,  some  indications  of  difficulty  were  uncovered.  Initial  guess  construction, 
starting  mesh  size,  objective  function  scaling,  and  the  selection  of  e  each  showed  signs 
of  becoming  factors  in  certain  problems  but  not  in  others.  These  issues  should  be 
resolved  before  any  real-world  application  of  the  algorithm. 

Finally,  in  the  present  chapter,  we  introduced  several  extensions  of  the  FDMI 
algorithm,  describing  the  basics  of  each,  but  leaving  the  bulk  of  the  work  to  future 
research.  The  tools  developed  in  this  thesis  show  much  promise  for  numerically  solv¬ 
ing  the  fault  detection  and  model  identification  problem  in  linear  descriptor  systems. 
Where  previously  there  existed  few  multi-model  methods  for  dealing  with  short  deci¬ 
sion  horizons,  this  work  provides  a  new  approach  for  online  fault  detection  and  model 
identification. 
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Appendix  A 


Software  Drivers 


Several  pieces  of  commercial  software  can  be  combined  to  implement  the  FDMI  algo¬ 
rithm  presented  in  this  thesis.  Model  reduction  is  efficiently  accomplished  in  MAT- 
LAB,  by  The  MathWorks,  Inc.  [26].  FORTRAN  code  generation  of  the  reduced 
system  model  is  done  in  MAPLE,  by  Waterloo  Maple,  Inc.  [18].  Optimization  is  car¬ 
ried  out  in  SOCS,  by  Boeing  [3,  4]  for  both  the  minimum  energy  detection  signal  and 
the  normal  to  the  separating  hyperplane.  MATLAB  is  used  to  analyze  the  output 
from  SOCS  and  create  plots.  Sample  driver  files  for  each  phase  of  this  process  are 
included  below. 


A.l  Model  Reduction 


The  transformation  of  the  original  system  to  the  reduced  model  discussed  in  Chapter 
2  is  aceomplished  by  the  following  MATLAB  m-file. 

7,reduced2d.m 

7, current  parameters  for:  common  mode  example  (examlO) 

7.system  matrices 
A0=[1  0; 

0  2]  ; 

Al=[l  0; 
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0  3]  ; 

B0=[2  1; 

1  2]; 

Bl=[2  1; 

1  2]; 

C0=[1  0; 

0  1]; 

Cl=[l  0; 

0  1]; 

M0=[0  0  0  1; 

0010]; 

M1=[0  0  0  1; 

0  0  10]; 

N0=[0  1  0  0; 

1  0  0  0]; 

N1=C0  100; 

1  0  0  0]; 

‘/.may  need  to  do  constant  orthogonal  change  of  coords  on  the  noise 
’/.do  it  via  a  QR  decomp  on  N_i~T  to  get  [Nb_i  0]  , 

‘/.where  Nb_i  is  invertible,  also  gives  [Mb_i  Mt_i] 

[Q0,R0]=qr(N0’) 

[Ql,Rl]=qr(Nl’) 

pause 

‘/.now  N.i'’!  =  Q_i  *  R_i,  so  N_i  =  R_i~T  *  Q.i"! 

‘/.and  Q.i'T  is  an  orthogonal  matrix 

‘/.absorb  Q_i''T  into  the  noise  vector  nu_i  to  get  new  noise  vector 
‘/.and  Nb_i  becomes  the  invertible  part  of  R_i''T 

‘/.may  need  to  fix  signs  in  Q_i  and/or  R_i 
R0=-R0 ; 

R1=-R1; 

Q0(1,1)=-Q0(1,1); 

Q0(2,2)=-Q0(2,2); 

Q1(1,1)=-Q1(1,1); 

qi(2,2)=-Ql(2,2); 

'/.break  down  into  Mb_i,  Mt_i,  Nb_i,  and  Nt_i  (zeros) 
[mN0,nN0]=size(N0) ; 

[mNl ,nMl]=size(Nl) ; 
mnN0=min(mN0 ,nN0) ; 
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iraiNl=min(mNl  ,nNl) ; 
inxNO=max(mNO,nNO) ; 
mxNl=max(mNl ,nNl) ; 

NOn=RO’; 

Nln=Rl' ; 

NbO=NOn(l :ranNO , 1 :mnNO) ; 

Nbl=Nln(l:mnNl,l:nmNl) ; 

NtO=NOn( :  .mnNO+l  :mxNO) ;  '/.just  need  the  size  of  this 
Ntl=Nln( :  ,innNl+l  :mxNl) ;  7, and  this 

M0n=M0*Q0’ ; 

Mln=Ml*Ql' ; 

Mb0=M0n(:  ,l:innN0); 

Mbl=Mln(: ,l:nmNl); 

Mt0=M0n( : ,mnN0+l rmxNO) ; 

Mtl=Mln( :  ,innNl+l  :mxNl) ; 

7, create  reduced  model  system  matrices 
[mAO.nAO] =size(A0) ; 

[mAl,nAl]=size(Al) ; 

[mMtO,nMtO]=size(MtO) ; 

[mMtl,nMtl]=size(Mtl) ; 

[mNb,nNb]=size(Nbl) ; 

[mNtO,nNtO]=size(NtO) ; 

[mNtl ,nNtl]  =size (Ntl) ; 

A= [AO-MbO*inv(NbO)*CO  MbO*inv(NbO) *C1 ;  zeros (mAl ,nAO)  Al] 

M=[MtO  MbO*inv(NbO)*Nbl  zeros (mMtO ,nMtl) ;  zeros (mMtl ,nMtO)  Mbl  Mtl] 
B=[B0:B1] 

C=[inv(NbO)*CO  -inv(NbO)*Cl] ; 

N=inv(NbO)* [zeros (mNb.nNtO)  Nbl  zeros(mNb,nNtl)] ; 

7.Q  and  H  without  beta 

Qnb=2*C’*C 

Hnb=-4*C'*N 

7tSize  of  upper  left  block  of  R 
ulident=nNtO 

7.Nbl’NbO’ "-INbO'-lNbl  for  the  center  block  of  R 
Nbl 

Nmess=Nbl ’ *inv(NbO’ )*inv(NbO) *Nbl 
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7oSize  of  lower  right  block  of  R 
lrident=nNtl 

y.end  of  routine 

Outputs  from  this  routine  are  used  as  follows: 

•  sizes  of  reduced  model  system  matrices  are  inputted  into  MAPLE, 

•  parameters  of  the  R  matrix  are  inputted  into  MAPLE, 

•  parameters  of  the  reduced  model  system  matrices  are  inputted  into  SOCS. 


A. 2  Fortran  Code  Generation 

The  following  MAPLE  routine  takes  the  sizes  of  reduced  model  system  matrices  and 
the  parameters  of  the  R  matrix  and  creates  FORTRAN  code  for  pasting  into  the 
SOCS  FORTRAN  driver. 

“/oreduced2d.mws 

>  restart; 

>  with(linalg) : 

"/pSizes  of  reduced  system  matrices 

>  A :=matrix(4,4) ; 

>  B:=matrix(4,l) ; 

>  M:=matrix(4,5) ; 

>  Q :=matrix(4,4) ; 

>  H:=matrix(4,5) ; 

>  Nmess:=matrix(l,l) ; 

'/pStructure  of  the  R  matrix 

>  R:=matrix(5,5, [2*P[1] ,0,0,0,0,0,2*P[1] ,0,0,0,0,0,2*(l-PCl] )+ 
2*P[l]*Nmess[l,l] , 0 ,0,0,0 ,0,2* (1-P [1] ) ,0,0,0 ,0,0,2* (1-P [1] )]) ; 

7.S0CS  needs  all  variables  in  a  single  vector 

>  Z:=vector(15) ; 

>  X :  =vector  (4 ,  [Z  [1]  ,  Z  [2]  ,  Z  [3]  ,  Z  [4]  ]  )  ; 

>  lambda :=vector (4,  [Z[5] ,Z[6] ,Z[7] ,Z[8]]) ; 

>  nu:=vector(5,  [Z[10] ,Z[11] ,Z[12] ,Z[13] ,Z[14]]) ; 
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>  v:=vector(l,  [Z[15]]) ; 

’/.create  constraint  equations 

>  FI :=matadd(matadd(inultiply(A,x) .multiply (B,v) ) ,multiply(M,nu) ) ; 

>  F2 :=mat add (mat add (multiply (-P [1] *Q ,x) .multiply (-P [1] *H.nu) /2) . 
multiply(transpose(-A) .lambda)) ; 

>  F3:  =  (multiply (multiply (transpose (x) .P[l]  *Q) .x) 

+multiply (multiply (transpose (x) .P[l] *H) .nu) 
tmultiply (multiply (transpose (nu) .R) .nu))/2; 

>  F4:=matadd(matadd (multiply (R,nu) . 

multiply (transpose (P[l] *H) .x)/2) .multiply (transpose (M) .lambda)) ; 

*/, translate  constraint  equations  into  FORTRAN  code 

>  fortran(Fl); 

>  fortran(F2); 

>  fortran(F3); 

>  fortran(F4); 

7, end  of  routine 


A. 3  Optimization  via  the  FDMI  Algorithm 

The  following  SOCS  driver  takes  the  FORTRAN  code  output  by  the  above  MAPLE 
routine,  as  well  as  the  parameters  of  the  reduced  model  system  matrices  from  the 
above  MATLAB  routine,  to  accomplish  both  halves  of  the  FDMI  algorithm. 

C  Main  driver:  2dex4m.f 

PROGRAM  MID42D 

C  Generic  2D  FDMI  problem  on  [TO.TF]  in  formulation  2nR 
C  using  min  distance  between  convex  set  theory 

C  Need  to  set  problem  parameters  in  seven  places: 

C  a  few  lines  down  from  here,  and  in  0PT2IN,  MID2IN  (T0,TF  only) 

C  near  end  of  MAIN  (output  file  names  in  open  calls) 

C  in  0PT2DE  (all  coordinate  changed  parameters) 

C  in  MID2DE  (all  original  problem  parameters  plus  epsilon  (EPSIL)) 

C  in  function  YBAR(T)  (coordinate  changed  parameters  Cl  and  NBl) 
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C  current  parameters  are  for:  multimod2d_12 
INTEGER  I , NIW , MAXPHS , NW , MAXCS , MAXDP 

PARAMETER  (NIW  =  10000000 , MAXPHS  =  5,NW  =  10000000 ,MAXCS=100000) 
PARAMETER  (MAXDP  =  1000) 

INTEGER  IW(NIW) ,IPCPH(MAXPHS+1) ,IPDPH(MAXPHS+1) .NEEDED, lER 
DOUBLE  PRECISION  TO,TF,T,W(NW) ,CSTAT(MAXCS) ,DPARM(MAXDP) .YEAR 
PARAMETER  (T0=0.D0,TF=1 .DO) 

C  OCSEVL  will  be  called  during  second  SOCS  run  (MI)  to  extract 
C  output  from  first  SOCS  run  (MEDS)  needed  in  MI  optimization,  so  need 
C  to  save  some  of  the  parameters  from  first  SOCS  run  that  eire  args 
C  to  OCSEVL  to  new  variable  names  as  defined  below 

INTEGER  IPCPH2(MAXPHS+1) ,IPDPH2(MAXPHS+1) 

DOUBLE  PRECISION  CSTAT2 (MAXCS) , DP ARM2 (MAXDP) ,W2(NW) 


C  Also  need  them  to  be  common  because  SOCS  will  not  pass  them 
C  to  the  functions  that  use  them 

COMMON  CSTAT2 , IPCPH2 , DPARM2 , IPDPH2 , W2 

INTEGER  IPHASE , NDP , lOUNIT , lOFLAG , NPTAUX , MXSTAT , NDYN 
PARAMETER  (MXSTAT  =  20) 

DOUBLE  PRECISION  STSKL(0 : MXSTAT) ,DTAUX 

EXTERNAL  0PT2IN , MID2IN .DUMYIG , 0PT2DE ,MID2DE 
EXTERNAL  DUMYPF , DUMYPR 


C  Output  verbosity 
CC  CALL  HHS0CS('IPGRD=O’) 

C  default=10 

C  CALL  HHS0CS(’IPFSFD=0’)  default=0 

CC  CALL  HHS0CS(’IPNLP=0’) 

C  default=10 

C  CALL  HHS0CS(’IP0DE=O’)  default=0 

C  CALL  HHSNLP(’I0FLAG=0’) 

C  default=10 

CALL  HHSNLP ( ' MAXNFE=50000 ' ) 

CALL  HHSNLP  (’NITMAX=1000O 
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CALL  HHSNLP(’KTOPTN=SMALL’) 

CALL  HHSNLP ( ’ ALGOPT=FM ’ ) 

CALL  HHS0CS(’MIT0DE=15’) 

CALL  HHSOCS(’ODETOL=l.D-7’) 

CALL  HHSOCS(’SPRTHS=SPARSE') 

CALL  HHS0CS(’MTSWCH=3>) 

CALL  HHS0CS(’NSSWCH=1’) 

C  Optional  tolerances  for  quicker  convergence 
C  CALL  HHS0CS(’0DETOL=0.2D-4’) 

C  CALL  HHSNLP (’C0NT0L=l.D-5’) 

C  CALL  HHSNLP (’ OB JT0L=l.D-5’) 

WRITE (*,*)  ’MADE  IT:  START  OF  MIN  V  PROBLEM’ 

C  OPTIMIZATION  TO  FIND  MIN  ENERGY  PROPER  V  -  first  SOCS  call 

CALL  HDSOCS (OPT2IN , DUMYIG , 0PT2DE , DUMYPF .DUMYPR , 

+  IW , NIW , W , NW , MAXPHS , CSTAT , MAXCS , IPCPH , DPARM , MAXDP , 

+  IPDPH, NEEDED, lER) 

C  Transfer  args  for  OCSEVL 

DO  10  I  =  1, MAXCS 
CSTAT2(I)  =  CSTAT (I) 

10  CONTINUE 

DO  20  I  =  1,MAXPHS+1 
IPCPH2(I)  =  IPCPH(I) 

IPDPH2(I)  =  IPDPH (I) 

20  CONTINUE 

DO  30  I  =  1, MAXDP 
DPARM2(I)  =  DPARM(I) 

30  CONTINUE 

DO  40  I  =  1,NW 
W2(I)  =  W(I) 

40  CONTINUE 


C  Output  YBAR  and  optimal  solution  from  MEDS  phase 
I0UNIT=32 

OPEN ( lOUNIT , FILE= ’ thrmod2d_lyb . dat ’ , STATUS= ’ UNKNOWN ’ ) 
DO  50  I  =  1,251 
T  =  T0+(TF-T0)*(I-1)/2.5D2 
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WRITEdOUNIT,*)  YBAR1(T),YBAR2(T),YBAR3(T) 

50  CONTINUE 

CLOSE (lOUNIT) 

OPEN ( lOUNIT , FILE= ’ thrmod2d_lds . dat ’ , ST ATUS= ’UNKNOWN ’ ) 

NPTAUX  =  250 
NDP  =  15 
NDYN  =  16 
DTAUX  =  0 
lOFLAG  =10 
IPHASE  =  1 

CALL  DFILL(NDYN,1-D0,STSKL,1) 

CALL  AUXOUT ( IPHASE , CSTAT , DPARM , NDP , STSKL , W , NW , lOUNIT , 

+  I OFLAG, NPTAUX, DTAUX.DUMYPR) 

CLOSE(IOUNIT) 

C  OPTIMIZATION  TO  FIND  TEST  FUNCTION  -  second  SOCS  call 

WRITE (*,*)  ’MADE  IT:  START  OF  TEST  FN  PROBLEM’ 

CALL  HDSOCS (MID2IN , DUMYIG , MID2DE , DUMYPF , DUMYPR , 

+  IW , NIW , W , NW , MAXPHS , CSTAT , MAXCS , IPCPH , DPARM , MAXDP , 

+  IPDPH, NEEDED, lER) 


C  Output  complete  optimal  solution  from  MI  phase 
lOUNIT  =  32 

OPEN ( I OUNIT , F ILE= ’ thrmod2d_ lm7 . dat ’ , STATUS= ’ UNKNOWN ’ ) 

NPTAUX  =  250 

NDP  =  14 

NDYN  =15 

DTAUX  =  0 

lOFLAG  =  10 

IPHASE  =  1 

CALL  DFILL (NDYN, 1. DO, STSKL, 1) 

CALL  AUXOUT (IPHASE , CSTAT , DPARM , NDP , STSKL , W , NW , lOUNIT , 
+  I OFLAG , NPTAUX , DTAUX , DUMYPR) 

CLOSE(IOUNIT) 


END 
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C  MEDS  phase  driver 

SUBROUTINE  0PT2IN ( IPHASE , NPHS , METHOD , MSTG , NCF , NPF , NPV , NAV , 

+  NGRID , INIT , MAXMIN , MXPARM , PO , PLB , PUB , PLBL , 

+  MXSTAT , YO , Y1 , VLB , YUB , STSKL , STLBL , MXPCON , CLB , 

+  CUB , CLBL .MXTERM , COEF , ITERM , TITLE , lER) 

INTEGER  IPHASE, NPHS, METHOD, NSTG, NCF (3) ,NPF(2) , NPV, NAV, NGRID, 

+  INIT, MAXMIN, MXPARM, MXSTAT, MXPCON, MXTERM, ITERM(4, MXTERM) , 

+  lER 

DOUBLE  PRECISION  PO (MXPARM) , PLB (MXPARM) , PUB (MXPARM) ,Y0(0: MXSTAT) , 
+  Y1 (0 : MXSTAT) , YLB (-1 : 1 , 0 : MXSTAT) , YUB (-1 : 1 ,0 : MXSTAT) , 

+  STSKL(0:MXSTAT+MXPARM,2) , CLB (MXPCON) , CUB (MXPCON) , 

+  COEF (MXTERM) 

CHARACTER  PLBL(MXPARM+2) *80 , STLBL(0 : MXSTAT) *80 , 

+  CLBL(0: MXPCON) *80, TITLE(3) *60 

DOUBLE  PRECISION  TO,TF 
PARAMETER  (T0=0 .D0,TF=1 .DO) 

METHOD  =  3 
NSTG  =  1 
NCF(l)  =  9 
NCF (2)  =  5 
NCF (3)  =  1 
NPF (2)  =  0 
NAV  =  6 
NPV  =  1 
NGRID  =  11 
INIT  =  1 
MAXMIN  =  -1 


C  Initial  and  final  time 

Y0(0)  =  TO 
Y1(0)  =  TF 

C  Initial  guesses,  v  =  1,  beta  somewhere  in  the  interior  of  [0,1] 


DO  10  I  =  1,14 
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YO(I)  =  O.DO 
10  CONTINUE 

Y0(15)  =  l.DO 

DO  20  I  =  1,14 
Y1(I)  =  O.DO 
20  CONTINUE 

Yl(15)  =  l.DO 

P0(1)  =  0.2DO 


C  Fix  the  boundary  conditions 

DO  30  I  =  5,9 
YLB(-1,I)  =  O.DO 
YUB(-1,I)  =  O.DO 
30  CONTINUE 

DO  40  I  =  5,8 
YLB(1,I)  =  O.DO 
YUB(1,I)  =  O.DO 
40  CONTINUE 

YLB(1,9)  =  l.DO 


C  Bound  BETA  (avoiding  singularities  at  0  and  1) 


PLB(l)  =  O.OIDO 
PUB(l)  =  0.99D0 

C  Fix  the  start  and  finish  time 


YLB(-1,0)  =  Y0(0) 
YUB(-1,0)  =  Y0(0) 
YLB(1,0)  =  Y1(0) 
YUB(1,0)  =  Y1(0) 


C  Equality  constraints 

DO  50  I  =  1,5 
ITERM(1,I)  =  I 
ITERM(2,I)  =  1 
ITERM(3,I)  =  0 
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ITERM(4,I)  =  -I 
CLB(I)  =  O.DO 
CUB (I)  =  O.DO 
50  CONTINUE 


C  Objective  function 

ITERM(1,6)  =  0 
ITERM(2,6)  =  1 
ITERM(3,6)  =  0 
ITERM(4,6)  =  -6 

RETURN 

END 


C  MEDS  constraint  definition 


SUBROUTINE  0PT2DE ( IPHASE , T , Z , NZ , P , NP , F , NF , IFERR) 

INTEGER  IPHASE, NZ,NP,NF, IFERR 
DOUBLE  PRECISION  T,Z(NZ) ,P(NP) ,F(NF) 

DOUBLE  PRECISION  A(4,4) ,B(4,1) ,S1,S2 

DOUBLE  PRECISION  M(4,5) ,H(4,5) ,Q(4,4) ,Nmess(l,l) 


C  Nine  ODE  constraints 

C  Variable  list:  x:Z(l)-Z(4),  lambda:Z(5)-Z(8) ,  Z:Z(9), 
C  nu:Z(10)-Z(14) ,  v:Z(15),  beta:P(l) 


INTEGER  R.C 
C  Problem  parameters 

DATA  ((A(R,C),C=1,4) ,R=1 ,4)/-l .DO, 1 .D0,0.D0,0 .DO, 

#-l. DO, -3. DO, O.DO, O.DO, 

#0. DO, O.DO, -10. DO, 1. DO, 

#0. DO, 0. DO, -1. DO, -3. DO/ 

DATA  ((B(R,C) ,C=1,1),R=1,4)/1.D0,0.D0,1.D0,0.D0/ 

DATA  ((M(R,C) ,C=1,5),R=1,4)/O.DO,1.DO,O.DO,O.DO,O.DO, 
#1. DO, O.DO, O.DO, O.DO, O.DO, 

#0.D0,0.D0,0.D0,0.D0,1.D0, 

#0. DO, 0. DO, 0. DO, 1. DO, O.DO/ 
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DATA  ((Q(R,C) ,C=1,4) ,R=1,4)/2.D0,2.D0,-2.D0,-2.D0, 

#2. DO, 2. DO, -2. DO, -2. DO, 

#-2. DO, -2. DO, 2. DO, 2. DO, 

#-2. DO, -2. DO, 2. DO, 2. DO/ 

DATA  ((H(R,C) ,C=1,5) ,R=1,4)/O.DO,O.DO,-4.DO,O.DO,O.DO, 
#O.DO,O.DO,-4.DO,O.DO,O.DO, 

#O.DO,O.DO,4.DO,O.DO,O.DO, 

#O.DO,O.DO,4.DO,O.DO,O.DO/ 

DATA  ((Nmess(R,C) ,C=1,1) ,R=l,l)/l.DO/ 


F(l)  =  A(1,1)*Z(1)+A(1,2)*Z(2)+A(1,3)*Z(3)+A(1,4)*Z(4)+B(1,1)*Z(1 
#5)+M(l,l)*Z(10)+M(l,2)*Z(ll)+M(l,3)*Z(12)+M(l,4)*Z(13)+M(l,5)*Z(14 
#) 

F(2)  =  A(2,1)*Z(1)+A(2,2)=kZ(2)+A(2,3)*Z(3)+A(2,4)*Z(4)+B(2,1)*Z(1 
#5)+M(2,l)*Z(10)+M(2,2)*Z(ll)+M(2,3)*Z(12)+M(2,4)*Z(13)+M(2,5)*Z(14 

#) 

F(3)  =  A(3,1)*Z(1)+A(3,2)*Z(2)+A(3,3)*Z(3)+A(3,4)*Z(4)+B(3,1)*Z(1 
#5)+M(3,l)*Z(10)+M(3,2)*Z(ll)+M(3,3)*Z(12)+M(3,4)*Z(13)+M(3,5)*Z(14 
#) 

F(4)  =  A(4,1)*Z(1)+A(4,2)*Z(2)+A(4,3)*Z(3)+A(4,4)*Z(4)+B(4,1)*Z(1 
#5)+M(4,l)*Z(10)+M(4,2)*Z(ll)+M(4,3)*Z(12)+M(4,4)*Z(13)+M(4,5)*Z(14 
#) 

C  lambda' 

F(5)  =  -P(l)*Q(l,l)*Z(l)-P(l)*Q(l,2)*Z(2)-P(l)*q(l,3)*Z(3)-P(l)*Q 
#(l,4)*Z(4)-P(l)*H(l,l)*Z(10)/2-P(l)*H(l,2)*Z(ll)/2-P(l)*H(l,3)*Z(l 
#2)/2-P(l)*H(l,4)*Z(13)/2-P(l)*H(l,5)*Z(14)/2-A(l,l)*Z(5)-A(2,l)*Z( 
#6)-A(3,l)*Z(7)-A(4,l)*Z(8) 

F(6)  =  -P(1)*Q(2,1)*Z(1)-P(1)*Q(2,2)*Z(2)-P(1)*Q(2,3)*Z(3)-P(1)*Q 
#(2,4)*Z(4)-P(l)*H(2,l)*Z(10)/2-P(l)*H(2,2)*Z(ll)/2-P(l)*H(2,3)*Z(l 
#2)/2-P(l)*H(2,4)*Z(13)/2-P(l)*H(2,5)*Z(14)/2-A(l,2)*Z(5)-A(2,2)*Z( 
#6)-A(3,2)*Z(7)-A(4,2)*Z(8) 

F(7)  =  -P(l)*q(3,l)*Z(l)-P(l)*Q(3,2)*Z(2)-P(l)*Q(3,3)*Z(3)-P(l)*Q 
#(3,4)*Z(4)-P(l)*H(3,l)*Z(10)/2-P(l)*H(3,2)*Z(ll)/2-P(l)*H(3,3)*Z(l 
#2)/2-P(l)*H(3,4)*Z(13)/2-P(l)*H(3,5)*Z(14)/2-A(l,3)*Z(5)-A(2,3)*Z( 
#6)-A(3,3)*Z(7)-A(4,3)*Z(8) 

F(8)  =  -P(l)*q(4,l)*Z(l)-P(l)*q(4,2)*Z(2)-P(l)*Q(4,3)*Z(3)-P(l)*q 
#(4,4)*Z(4)-P(l)*H(4,l)*Z(10)/2-P(l)*H(4,2)*Z(ll)/2-P(l)*H(4,3)*Z(l 
#2)/2-P(l)*H(4,4)*Z(13)/2-P(l)*H(4,5)*Z(14)/2-A(l,4)*Z(5)-A(2,4)*Z( 
#6)-A(3.4)*Z(7)-A(4,4)*Z(8) 
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S2  =  (P(1)*Q(1,1)*Z(1)+Z(2)*P(1)*Q(2,1)+Z(3)*P(1)*Q(3,1)+Z(4)*P(1) 
#*Q(4,l))*Z(l)/2+(Z(l)*P(l)*Q(l,2)+P(l)*Q(2,2)*Z(2)+Z(3)*P(l)*Q(3,2 
#)+Z(4)*P(l)*Q(4,2))*Z(2)/2+(Z(l)*P(l)*Q(l,3)+Z(2)*P(l)*Q(2.3)+P(l) 
#*Q(3,3)*Z(3)+Z(4)*P(l)*Q(4,3))*Z(3)/2 
SI  =  s2+(Z(l)*P(l)*Q(l,4)+Z(2)*P(l)*Q(2,4)+Z(3)*P(l)*Q(3,4)+P(l)*Q 
#(4,4)*Z(4))*Z(4)/2+(Z(l)*P(l)*H(l,l)+Z(2)*P(l)*H(2,l)+Z(3)*P(l)*H( 
#3,l)+Z(4)*P(l)*H(4,l))*Z(10)/2+(Z(l)*P(l)*H(l,2)+Z(2)*P(l)*H(2,2)+ 
#Z(3)t‘P(l)*H(3,2)+Z(4)*P(l)*H(4,2))*Z(ll)/2+(Z(l)*P(l)*H(l,3)+Z(2)* 
#P(l)*H(2,3)+Z(3)*P(l)*H(3,3)+Z(4)*P(l)*H(4,3))tZ(12)/2 
F(9)=sl+(Z(l)*P(l)*H(l,4)+Z(2)*P(l)*H(2,4)+Z(3)*P(l)*H(3,4)+Z(4)*P 
#(l)*H(4,4))*Z(13)/2+(Z(l)*P(l)*H(l,5)+Z(2)*P(l)*H(2,5)+Z(3)*P(l)*H 
#(3,5)+Z(4)*P(l)*H(4,5))*Z(14)/2+Z(10)**2*P(l)+Z(ll)**2*P(l)+Z(12)* 
#*2*(2-2*P(l)+2*P(l)*Nmess(l,l))/2  +Z(13)**2*(2-2*P(l))/2+Z(14)**2* 
#(2-2*P(l))/2 


C  Five  algebraic  constraints 

F(10)  =  2*Z(10)*P(l)+Z(l)*P(l)*H(l,l)/2+Z(2)*P(l)*H(2,l)/2+Z(3)*P( 
#l)*H(3,l)/2+Z(4)*P(l)*H(4,l)/2+M(l,l)*Z(5)+M(2,l)*Z(6)+M(3,l)+Z(7) 
#+M(4,l)*Z(8) 

F(ll)  =  2*Z(ll)*P(l)+Z(l)*P(l)»H(l,2)/2+Z(2)*P(l)*H(2,2)/2+Z(3)*P( 
#l)*H(3,2)/2+Z(4)*P(l)*H(4,2)/2+M(l,2)*Z(5)+M(2,2)*Z(6)+M(3,2)*Z(7) 
#+M(4,2)*Z(8) 

F(12)  =  Z(12)*(2-2*P(l)+2>fP(l)*Nmess(l,l))  +Z(l)*P(l)*H(l,3)/2+Z(2 
#)*P(l)*H(2,3)/2+Z(3)*P(l)*H(3,3)/2+Z(4)*P(l)*H(4,3)/2+M(l,3)*Z(5)+ 
#M(2,3)*Z(6)+M(3,3)*Z(7)+M(4,3)*Z(8) 

F(13)  =  Z(13)*(2-2*P(l))+Z(l)*P(l)*H(l,4)/2+Z(2)*P(l)*H(2,4)/2+Z(3 
#)*P(l)*H(3,4)/2+Z(4)*P(l)*H(4,4)/2+M(l,4)*Z(5)+M(2,4)*Z(6)+M(3,4)* 
#Z(7)+M(4,4)*Z(8) 

F(14)  =  Z(14)*(2-2*P(l))+Z(l)*P(l)*H(l,5)/2+Z(2)*P(l)*H(2,5)/2+Z(3 
#)*P(l)*H(3,5)/2+Z(4)*P(l)*H(4,5)/2+M(l,5)*Z(5)+M(2,5)*Z(6)+M(3,5)* 
#Z(7)+M(4,5)*Z(8) 


C  Objective  function  is  a  quadrature 
F(15)  =  Z(15)*f2 

RETURN 

END 


C  MI  phase  driver 


Appendix  A.  Software  Drivers 


158 


SUBROUTINE  MID2IN ( IPHASE , NPHS , METHOD , NSTG , NCF , NPF , NPV , NAV , 

+  NGRID , INIT , MAXMIN , MXPARM , PO , PLB , PUB , PLBL , 

+  MXSTAT , YO , Y 1 , YLB , YUB , STSKL , STLBL , MXPCON , CLB , 

+  CUB , CLBL , MXTERM , COEF , ITERM , TITLE , lER) 

INTEGER  IPHASE, NPHS, METHOD, NSTG, NCF (3) ,NPF(2) , NPV, NAV, NGRID, 

+  INIT, MAXMIN, MXPARM, MXSTAT, MXPCON, MXTERM, ITERM (4, MXTERM) , 

+  lER 

DOUBLE  PRECISION  PO(MXPARM) ,PLB(MXPARM) ,PUB(MXPARM) ,Y0(0:MXSTAT) , 
+  Y 1 ( 0 : MXST AT) , YLB ( - 1 : 1 , 0 : MXSTAT) , YUB ( - 1 : 1 , 0 : MXSTAT) , 

+  STSKL(0:MXSTAT+MXPARM,2) ,CLB(MXPCON) ,CUB(MXPCON) , 

+  COEF (MXTERM) 

CHARACTER  PLBL(MXPARM+2) *80 , STLBL(0 : MXSTAT) *80 , 

+  CLBL(O:MXPC0N)*80,TITLE(3)*60 

DOUBLE  PRECISION  TO,TF 
PARAMETER  (T0=0.D0,TF=1 .DO) 

METHOD  =  3 
NSTG  =  1 
NCF(l)  =  6 
NCF(2)  =  2 
NCF(3)  =  1 
NPF(2)  =  0 
NAV  =  8 
NPV  =  0 
NGRID  =11 
INIT  =  1 
MAXMIN  =  -1 


C  Initial  and  final  time 

Y0(0)  =  TO 
Y1(0)  =  TF 


C  Initial  guesses 


DO  10  I  =  1,13 
YO(I)  =  l.DO 
10  CONTINUE 
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YO(14)=O.DO 

DO  20  I  =  1,13 
Y1(I)  =  O.DO 
20  CONTINUE 

Y1(14)=1.D0 


C  Boundary  conditions 

DO  30  I  =  5,6 
YLB(-1,I)  =  O.DO 
YUB(-1,I)  =  O.DO 
30  CONTINUE 

DO  40  I  =  5,6 
YUB(1,I)  =  l.DO 
40  CONTINUE 

C  Fix  the  start  and  finish  time 

YLB(-1,0)  =  Y0(0) 
YUB(-1,0)  =  Y0(0) 

YLB(1,0)  =  Y1(0) 

YUB(1,0)  =  Yl(O) 


C  Equality  constraints 


ITERM(1,1)  =  1 
ITERM(2,1)  =  1 
ITERM(3,1)  =  0 
ITERM(4,1)  =  -1 
CLB(l)  =  O.DO 
CUB(l)  =  O.DO 

ITERM(1,2)  =  2 
ITERM(2,2)  =  1 
ITERM(3,2)  =  0 
ITERM(4,2)  =  -2 
CLB(2)  =  O.DO 
CUB (2)  =  O.DO 


C  Objective  function 
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ITERM(1,3)  =  0 
ITERM(2,3)  =  1 
ITERM(3,3)  =  0 
ITERM(4,3)  =  -3 

RETURN 

END 


C  MI  constraint  definition 


SUBROUTINE  MID2DE ( IPHASE , T , Z , NZ , P , NP , F , NF , IFERR) 

INTEGER  IPHASE, NZ.NP,NF, IFERR 

DOUBLE  PRECISION  T,Z(NZ) ,P(NP) ,F(NF) ,VH,VHAT,EPSIL 
DOUBLE  PRECISION  A0(2,2) ,A1(2,2) ,B0(2) ,B1(2) ,C0(2) ,C1(2) 
DOUBLE  PRECISION  M0(2,3) ,M1(2,3) ,N0(3) ,N1(3) 


C  Six  ODE  constraints 

C  Variable  list:  x:Z(l)-Z(4),  q:Z(5)-Z(6),  nu:Z(7)-Z(12) ,  y :Z(13)-Z(14) 
PARAMETER  (EPSIL=0 . 7D0) 

INTEGER  R,C 
C  Problem  parameters 

DATA  ((A0(R,C),C=1,2),R=1,2)/-1.D0,1.D0,-1.D0,-3.D0/ 

DATA  ((A1(R,C),C=1,2),R=1,2)/-10.D0,1.D0,-1.D0.-3.D0/ 

DATA  BO/1. DO, 0. DO/, Bl/1. DO, 0. DO/, CO/1. DO, 1. DO/, Cl/1. DO, 1. DO/ 

DATA  ((M0(R,C),C=1,3),R=1,2)/0.D0,0.D0,1.D0,0.D0,1.D0,0.D0/ 

DATA  NO/1. DO, 0. DO, 0. DO/ 

DATA  ((M1(R,C),C=1,3),R=1,2)/O.DO,O.DO,1.DO,O.DO,1.DO,O.DO/ 

DATA  N1/1.D0,0.D0,0.D0/ 

C  First  compute  VH 

VH  =  VHAT(T) 


C 

F(l)  =  A0(1,1)*Z(1)+A0(1,2)*Z(2)+B0(1)*VH+EPSIL*(M0(1,1)*Z(7) 
#+M0(l,2)*Z(8)+M0(l,3)*Z(9)) 


Appendix  A.  Software  Drivers 


161 


F(2)  =  A0(2,l)*Z(l)+A0(2,2)*Z(2)+B0(2)*VH+EPSILt(M0(2,l)*Z(7) 
#+M0(2,2)*Z(8)+M0(2,3)*Z(9)) 

F(3)  =  A1(1,1)*Z(3)+A1(1,2)*Z(4)+B1(1)*VH+EPSIL*(M1(1,1)*Z(10) 
#+Ml(l,2)*Z(ll)+Ml(l,3)*Z(12)) 

F(4)  =  A1(2,1)*Z(3)+A1(2,2)^Z(4)+B1(2)*VH+EPSIL*(M1(2,1)*Z(10) 
#+Ml(2,2)*Z(ll)+Ml(2,3)*Z(12)) 


C  q’ 

F(5)  =  Z(7)>t'*2+  Z(8)**2+  Z(9)**2 
F(6)  =  Z(10)**2+Z(ll)**2+Z(12)**2 

C  Two  algebraic  constraints 

F(7)  =  C0(1)*Z(1)+C0(2)*Z(2)+EPSIL*(N0(1)*Z(7)+N0(2)*Z(8) 
#+N0(3)*Z(9))-Z(13) 

F(8)  =  C1(1)*Z(3)+C1(2)*Z(4)+EPSIL*(N1(1)*Z(10)+N1(2)*Z(11) 
#+Nl(3)*Z(12))-Z(14) 


C  Obj  function  is  a  quadrature 
F(9)  =  (Z(13)-Z(14))**2 

RETURN 

END 


C  Function  to  extract  MEDS  solution  VHAT  for  MI  phase 


DOUBLE  PRECISION  FUNCTION  VHAT(T) 

INTEGER  MAXPHS , NW , MAXCS , MAXDP , IPHASE , lER 

PARAMETER  (MAXDP  =  1000, MAXPHS  =  5,NW  =  10000000 ,MAXCS=100000) 

INTEGER  IPCPH2(MAXPHS+1) ,IPDPH2(MAXPHS+1) 

DOUBLE  PRECISION  CSTAT2 (MAXCS) ,DPARM2 (MAXDP) ,W2(NW) 

INTEGER  NUMZV , LNZVEC , NUMPV , LNPVEC 
PARAMETER (LNZVEC  =  50, LNPVEC  =10) 

DOUBLE  PRECISION  ZVEC ( LNZVEC ) ,PVEC( LNPVEC) 

DOUBLE  PRECISION  T,TZERO ,TFINAL 
COMMON  CSTAT2 , IPCPH2 , DPARM2 , IPDPH2 , W2 

IPHASE  =  1 


CALL  OCSEVL (MAXPHS , CSTAT2 , MAXCS , IPCPH2 , DPARM2 , MAXDP , IPDPH2 , 
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+  W2 , NW , IPHASE , T , TZERO , TFINAL , ZVEC , NUMZV , LNZVEC , 
+  PVEC,NUMPV,LNPVEC,IER) 

VHAT  =  ZVEC (15) 

RETURN 

END 


C  Function  to  extract  MEDS  solution  YEAR  for  MI  phase 
DOUBLE  PRECISION  FUNCTION  YBAR(T) 

INTEGER  MAXPHS,NW,MAXCS,MAXDP, IPHASE, lER 

PARAMETER  (MAXDP  =  lOOO.MAXPHS  =  5,NW  =  10000000, MAXCS=100000) 

INTEGER  IPCPH2(MAXPHS+1) ,IPDPH2(MAXPHS+1) 

DOUBLE  PRECISION  CSTAT2(MAXCS) , DP ARM2 (MAXDP) ,W2(NW) 

INTEGER  NUMZV , LNZVEC , NUMPV , LNPVEC 
PARAMETER (LNZVEC  =  50, LNPVEC  =  10) 

DOUBLE  PRECISION  ZVEC (LNZVEC) ,PVEC (LNPVEC) 

DOUBLE  PRECISION  T, TZERO, TFINAL 

DOUBLE  PRECISION  C1(2),NB1 

COMMON  CSTAT2 , IPCPH2 , DPARM2 , IPDPH2 , W2 

PARAMETER  (NB1=1.D0) 

DATA  Cl/1. DO, 1. DO/ 

IPHASE  =  1 

CALL  OCSEVL (MAXPHS , CSTAT2 , MAXCS , IPCPH2 , DPARM2 , MAXDP , IPDPH2 , 

+  W2 , NW , IPHASE , T , TZERO , TFINAL , ZVEC , NUMZV , LNZVEC , 

+  PVEC, NUMPV, LNPVEC, lER) 

YBAR  =  C1(1)*ZVEC(3)  +  C1(2)*ZVEC(4)  +  NB1*ZVEC(12) 

RETURN 

END 
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A. 4  Analysis  and  Presentation  of  Results 

The  above  SOCS  driver  creates  several  data  files  as  outputs.  These  files  are  suitable 
for  processing  in  MATLAB.  The  following  MATLAB  m-files  process  the  data  files 
outputted  by  SOCS  to  facilitate  analysis  and  visualization  of  results. 


A. 4.1  Detection  Signal  Phase  Processing 

This  m-file  reads  and  plots  the  minimum  energy  detection  signal  and  the  separability 
index. 

'/.portion  of  exl_lto4.in 
7,exl_2_* .  dat 

fid  =  fopen('exl_2_l.dat’ , ’rO  : 

[A21,cnt]  =  fscanf(fid,’7.22g’,[ll,208]); 
f close(f id) ; 

fid  =  fopen(’exl_2_10.dat’ , ’r’) ; 

[A210,cnt]  =  fscanf(fid,’*/.22g>,[ll,394]); 
fclose(fid) ; 

fid  =  fopen(’exl_2_20.dat’ , ’r’) ; 

[A220,cnt]  =  fscanf  (fid, ’7.22g’ ,  [11,642])  ; 
fclose(fid) ; 

fid  =  fopen(’exl_2_100.dat’ , ’r’) ; 

[A2100,cnt]  =  f  scanf(f id, ’7,22g’ ,  [11,800]) ; 
f close(f id) ; 

’/.gamma .  dat 

fid  =  fopen(’gamma.dat’ , ’rO  ; 

[G,cnt]  =  fscanf  (fid, ’*/.22g’ ,  [3,20] ) ; 
f close(fid) ; 

’/.extract  vhat 

vl=A21(ll, :) ;vl0=A210(ll, :) ;v20=A220(ll, : ) ;vlO0=A21O0(ll , : ) ; 

’/.need  to  rescale  the  vhats 

’/.divide  by  the  sqrt  of  the  objective  function 
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7oT=100  case  gave  negative  vhat,  so  resign  it 
vl=vl/sqrt(A21(7,104)); 
vl0=vl0/sqrt(A210(7,197)) ; 
v20=v20/sqrt(A220(7,321)) ; 
vl00=-vl00/sqrt(A2100(7,400)) ; 

’/.extract  time  vector 

tl=A21(l,:);  tl0=A210(l , : ) ;  t20=A220(l , : ) ;  tl00=A2100(l , ; ) ; 

'/.plot  vhats 
plotCtl ,vl) 
plot (tlOjVlO) 
plot(t20,v20) 
plot(tl00,vl00) 

'/.thesis  plots  exl21v.eps,  exl210v.eps,  exl220v.eps,  exl2100v.eps 

"/.combined  plot  of  [0,20],  [0,100]  cases 
plot (t20, v20 ,tl00 , vlOO) 

’/.thesis  plot  exl2v20vl00.eps 

'/.plot  gamma 

T=G ( 1 , : ) ; gamma=sqrt (G (2 , : ) ) ; 
plot (T, gamma) 

’/.thesis  plot  exlgammultT.eps 

’/.rescale  the  vhats  and  the  time  vectors  for  combined  plot  of  all  cases 
vl=vl/max(vl) ; 

vl0=vl0/max(vl0) ;tl0=tl0/max(tl0)  ; 
v20=v20/max(v20) ;t20=t20/max(t20) ; 
vl00=vl00/max(vl00) ;tl00=tl00/max(tl00) ; 
plot (tl , vl ,tl0, vlO , t20, v20 ,tl00, vlOO) 

’/.thesis  plot  exl2multv.eps 

’/.end  of  routine 


A.4.2  Model  Identification  Phase  Processing 

This  m-file  reads  and  plots  the  normal  to  the  separating  hyperplane  and  ybar. 
’/.adex7_2m.m 

fid  =  fopen(’adex7_2_lm7.dat’ , ’r’) ; 
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[Aa71m7,cnt]  =  fscanf  (fid, ’'/,22g’ ,  [15,250] ) ; 
fclose(f id) ; 

*/, extract  yO  and  yl 
yl7=Aa71m7(15, :) ;  y07=Aa71m7(14, : ) ; 

7, compute  yb7  as  the  midpoint  of  segment  {(ybl+yb0)/2]- 
yb7=  (yl7+  y07)/2; 

7,extract  time  vector 
te=  Aa71m7 (1 , : ) ; 

7,construct  normal  to  hyperplane 
a7=  (yl7-  y07)/  L2norm(yl7-y07,te) ; 

7, plot  ybar 
plot(te,yb7) 

7othesis  plot  ae721yb7.eps 

7.plot  a(t) 
plot(te,a7) 

7.thesis  plot  ae721a7.eps 

7.L2norm.m  function  to  compute  L2  norm 
function  x  =  L2norm(f,t) 

7ofind  the  L2  norm  of  a  discretized  function  on  [0,tf] 
7.uses  right  approximation  (as  opposed  to  left/center) 
7.input  vector  function  f 

7.input  time  vector  t  corresponding  to'  values  of  f 
N  =  length (f) ; 
if  N“=length(t) 

error ( ’vectors  must  be  same  length’) 

end 

tr=t(2:N) ; 
tl=t(l:N-l); 
dt=tr-tl ; 
fsq=f (1:N-1) .‘2; 
x=sqrt (f  sq*dt ’ ) ; 


7.end  of  routine 


